Abstract


This paper is concerned with the numerical behavior of Gauss-Newton methods for nonlinear
least-squares problems. Here we assume that the defining feature of a Gauss-Newton method
is that the direction from one iterate to the next is the numerical solution of a particular linear
least-squares problem, with a steplength subsequently determined by a linesearch procedure. It is
well known that Gauss-Newton methods cannot be successfully applied to nonlinear least-squares
problems as a class without modification. Our purpose is to give specific examples illustrating
some of the difficulties that arise in practice, which we believe have not been fully described in
the literature.


Acknowledgements 


I would like to thank Philip Gill, Walter Murray, Joseph Oliger, and Margaret Wright for
discussion and comments on earlier versions of this paper. Special thanks go to Margaret
Wright for all kinds of help with the present version, and for encouraging me to submit it for
publication.

I am fortunate to have had generous financial support for this research in the form of a
fellowship from the Xerox Corporation, with additional funding provided by Joseph Oliger under
Office of Naval Research contract N00014-82-K-0335. I am also indebted to Stanford Linear
Accelerator Center for the use of their computer facilities.


1.  Introduction 


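The nonlinear least-squares problem is given by

$$\min_{x \in \mathbb{R}^n} \; \tfrac{1}{2} \sum_{i=1}^{m} \phi_i(x)^2 ,$$

where $\phi_i(x)$ are real-valued functions, or, equivalently,

$$\min_{x \in \mathbb{R}^n} \; \tfrac{1}{2} \|f(x)\|_2^2 ,$$

where

$$f(x) \equiv \begin{pmatrix} \phi_1(x) \\ \phi_2(x) \\ \vdots \\ \phi_m(x) \end{pmatrix}.$$

We assume that each $\phi_i$ has continuous second partial derivatives. The function $\tfrac{1}{2}\|f(x)\|_2^2$ will
be called the least-squares objective function.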

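The classical approach to nonlinear least squares, called the Gauss-Newton method, locally
approximates each residual component $\phi_i$ of $f$ by a linear function, using the relationship

$$f(x + p) = f(x) + J(x)p + O(\|p\|_2^2), \qquad (1.1)$$

where $J$ is the Jacobian matrix of $f$, that is

$$J(x) \equiv \nabla f(x) = \begin{pmatrix} \partial\phi_1/\partial x_1 & \cdots & \partial\phi_1/\partial x_n \\ \vdots & & \vdots \\ \partial\phi_m/\partial x_1 & \cdots & \partial\phi_m/\partial x_n \end{pmatrix}.$$

The step to the new iterate from the current point is in the direction of the vector $p$ that solves

$$\min_{p} \|f + Jp\|_2^2 ;$$

in other words, the change in the nonlinear least-squares objective $\tfrac{1}{2} f^T f$ is being modeled by
the quadratic function

$$g^T p + \tfrac{1}{2} p^T J^T J p,$$

where

$$g \equiv \nabla\left(\tfrac{1}{2} f^T f\right) = J^T f. \qquad (1.2)$$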

Hence the Gauss-Newton method differs from Newton's method in that the Hessian matrix

$$\nabla^2\left(\tfrac{1}{2} f^T f\right) = J^T J + \sum_{i=1}^{m} \phi_i \nabla^2 \phi_i$$

is approximated by $J^T J$, a strategy that would seem reasonable when the residuals are small.
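
As a rough illustration of the model just described, the following Python sketch takes the search direction from the linear least-squares problem $\min_p \|f + Jp\|_2^2$ and then picks a steplength by simple backtracking. The function name `gauss_newton`, the Armijo constant `1e-4`, the halving rule, and the use of NumPy's `lstsq` are all assumptions made for illustration; they are not the software used for the experiments reported in this paper.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, tol=1e-10, max_iter=100):
    """Gauss-Newton direction plus backtracking linesearch (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    steps = 0
    for _ in range(max_iter):
        f = residual(x)
        J = jacobian(x)
        g = J.T @ f                      # gradient of 0.5 * f^T f
        if np.linalg.norm(g) <= tol:
            break
        # Search direction: a solution of min_p ||J p + f||_2
        p = np.linalg.lstsq(J, -f, rcond=None)[0]
        phi0 = 0.5 * (f @ f)
        slope = g @ p                    # directional derivative, non-positive
        alpha = 1.0
        for _ in range(60):              # hypothetical cap on halvings
            f_new = residual(x + alpha * p)
            if 0.5 * (f_new @ f_new) <= phi0 + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        x = x + alpha * p
        steps += 1
    return x, steps

# Example call on the (zero-residual) Rosenbrock function written as residuals.
rosen_f = lambda x: np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])
rosen_J = lambda x: np.array([[-20.0 * x[0], 10.0], [-1.0, 0.0]])
x_final, n_steps = gauss_newton(rosen_f, rosen_J, [-1.2, 1.0])
```

For a linear residual $f(x) = Ax - b$ the quadratic model is exact, so the unit step is accepted and a single iteration reaches a least-squares solution; on nonlinear problems the behavior depends on the factors discussed in the rest of the paper.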

Although it is well known that the Gauss-Newton method does not work well under all
circumstances, it is not possible to say anything more precise about the method when considering
large and varied sets of test problems. Detailed numerical study is essential in order to understand
the practical shortcomings of the Gauss-Newton method. In this paper we analyze specific
examples of performance that reveal some of the difficulties that may be encountered in practice.

1.1.  Overview 

In Section 2, we show that a class of numerical methods, rather than a single method, is
defined by the linearization (1.1) of $f$, and motivate these methods from considerations that
arise in unconstrained optimization (see, for example, Fletcher [1980], Gill, Murray, and Wright
[1981], Dennis and Schnabel [1983], or Moré and Sorensen [1984]) and linear least squares (see,
for example, Stewart [1973], Lawson and Hanson [1974], or Golub and Van Loan [1983]). Section
3 surveys research related to computational aspects of Gauss-Newton methods. In Section 4, we
give a general description of how the numerical results presented in the remaining sections of
the paper were obtained. Examples of the performance of Gauss-Newton methods on problems
with ill-conditioned Jacobians are presented in Section 5. An example of poor performance of a
Gauss-Newton method on a zero-residual problem with a well-conditioned Jacobian is analyzed
in Section 6. Tables of numerical results for two different Gauss-Newton methods for a large set
of test problems are included in an appendix.

1.2.  Notation 

Generally subscripts on a function mean that the function is evaluated at the corresponding
subscripted variable (for example, $f_k = f(x_k)$). An exception is made for the residual functions
$\phi_i$, where the subscript is the component index within the vector $f$.

2.  Motivation


The Gauss-Newton method for nonlinear least squares can be viewed as a modification of
Newton's method in which $J^T J$ is used to approximate the Hessian matrix of the least-squares
objective function

$$J(x)^T J(x) + \sum_{i=1}^{m} \phi_i(x) \nabla^2 \phi_i(x).$$

Two promising aspects of this approximation are that computation of $J^T J$ involves only first
derivatives, and that $J^T J$ is always at least positive semi-definite. Moreover, if $f(x^*) = 0$ and
$J(x^*)^T J(x^*)$ is positive-definite, then $x^*$ is an isolated local minimum and the method is locally
quadratically convergent. To see this, define

$$B \equiv \sum_{i=1}^{m} \phi_i(x) \nabla^2 \phi_i(x)$$

($B$ is the neglected term in the Hessian) and consider the expansion of (1.2):

$$0 = J(x^*)^T f(x^*) = g + (J^T J + B)(x^* - x) + O(\|x - x^*\|^2),$$


which is valid since it is assumed that $f$ has continuous second derivatives. The Gauss-Newton
search direction at the current iterate minimizes the quadratic function

$$g^T p + \tfrac{1}{2} p^T J^T J p, \qquad (2.1)$$

and therefore satisfies the equations

$$J^T J p = -g. \qquad (2.2)$$

Because $J(x^*)^T J(x^*)$ is positive definite and $J$ is continuous, $(J^T J)^{-1}$ exists and has bounded
norm in a neighborhood of $x^*$. Hence convergence is quadratic when $\|(J^T J)^{-1} B\|$ is
$O(\|x - x^*\|)$. In particular, quadratic convergence must eventually occur whenever $f(x^*) = 0$,
because then $\|f\|$ is $O(\|x - x^*\|)$ (and so is $\|B\|$). When the objective vanishes at a minimum,
(2.1) is an $O(\|p\|_2^2)$ approximation to $\tfrac{1}{2}\left(\|f(x + p)\|_2^2 - \|f(x)\|_2^2\right)$, so that in the limit the
Gauss-Newton direction approaches the Newton search direction $p_N$, which satisfies

$$(J^T J + B)\, p_N = -g.$$
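
The condition for quadratic convergence can be made explicit in a few lines. The derivation below is a sketch under the assumptions already stated (continuous second derivatives, $J^T J$ nonsingular near $x^*$) together with the additional assumption that a unit step is taken; $g$, $J$, and $B$ are evaluated at the current iterate $x$, and only (2.2) and a Taylor expansion of the gradient are used.

```latex
\begin{align*}
0 = g(x^*) &= g + (J^T J + B)(x^* - x) + O(\|x - x^*\|^2), \\
p = -(J^T J)^{-1} g &= (x^* - x) + (J^T J)^{-1} B\,(x^* - x) + O(\|x - x^*\|^2), \\
(x + p) - x^* &= (J^T J)^{-1} B\,(x^* - x) + O(\|x - x^*\|^2).
\end{align*}
```

The last line shows that the new error is $O(\|x - x^*\|^2)$ exactly when $\|(J^T J)^{-1} B\|$ is $O(\|x - x^*\|)$, and that otherwise the error is reduced at best linearly, by a factor governed by $(J^T J)^{-1} B$.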


When $f(x^*) \neq 0$, the Gauss-Newton method will converge linearly if the smallest singular value
of $J^T J$ exceeds the largest singular value of $B$, but may otherwise diverge. It is not convergent
when the minimum singular value of $B$ exceeds the maximum singular value of $J^T J$ in a
neighborhood of a solution. For more detailed convergence analysis see, for example, Osborne [1972],
McKeown [1975a, 1975b], Ramsin and Wedin [1977], Deuflhard and Apostolescu [1980], Dennis
and Schnabel [1983, Chapter 10], Schaback [1985], and Haussler [1986].

A drawback of the Gauss-Newton method is that when $J^T J$ is singular, or, equivalently, when
$J$ has linearly dependent columns, (2.1) does not have a unique minimizer. For this reason the
Gauss-Newton method should more accurately be viewed as a class of methods, each member
being distinguished by a different choice of $p$ when $J^T J$ is singular. The set of vectors that
minimize (2.1) is the same as the set of solutions to the linear least-squares problem

$$\min_{p \in \mathbb{R}^n} \|Jp + f\|_2^2. \qquad (2.3)$$

One (theoretically) well-defined alternative that is often approximated computationally is to
require the unique solution of minimum $\ell_2$ norm:

$$\min_{p \in S} \|p\|_2, \qquad (2.4)$$

where $S$ is the set of solutions to (2.3), while another is to replace $J$ in (2.3) by a maximal linearly
independent subset of its columns (see, for example, the references cited above on linear least
squares). In finite-precision arithmetic, there is often some ambiguity about how to formulate and
solve these alternative subproblems when the columns of $J$ are "nearly" linearly dependent, so
that, from a computational standpoint, any particular Gauss-Newton method must still be viewed
as a class of methods. The references cited above for linear least squares discuss at length the
difficulties inherent in computing solutions to (2.3) when $J$ is ill-conditioned, and show that the
numerical solution of these problems is dependent on the criteria used to estimate the rank of
$J$. From now on, the term "Gauss-Newton method" will refer to any linesearch method in which
the search direction is the result of any well-defined computational procedure for solving (2.3).
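
The non-uniqueness described above can be seen on a tiny rank-deficient example. The sketch below is illustrative only: the matrix `J`, the vector `f`, and the tolerance `rcond=1e-10` are made-up values, and NumPy's `lstsq` stands in for whatever least-squares solver is actually used. It computes the minimum-norm solution (2.4) and a "basic" solution that uses only the first column of `J`.

```python
import numpy as np

# Rank-one Jacobian: the second column is twice the first.
J = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
f = np.array([1.0, 1.0, 1.0])

# Minimum l2-norm solution of min ||J p + f||_2; lstsq returns it when J is
# rank deficient relative to the cutoff rcond.
p_min = np.linalg.lstsq(J, -f, rcond=1e-10)[0]

# Alternative: keep a maximal independent subset of columns (here column 0)
# and set the remaining component of p to zero.
p_basic = np.zeros(2)
p_basic[0] = np.linalg.lstsq(J[:, :1], -f, rcond=None)[0][0]

res_min = np.linalg.norm(J @ p_min + f)
res_basic = np.linalg.norm(J @ p_basic + f)
print(p_min, p_basic, res_min, res_basic)
```

Both vectors attain the same residual norm, so both solve (2.3), yet they point in different directions and have different lengths; a linesearch started along one or the other will generally visit different iterates.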

For the moment, assume that a solution $p$ to (2.3) can be computed. Then because $p$
satisfies (2.2), $p$ is a direction of descent for $f^T f$ whenever $J^T f \neq 0$ (in other words, $g^T p < 0$,
so that $f^T f$ initially decreases along $p$). To guarantee convergence, the search direction must
also be bounded away from orthogonality to the gradient, a condition that may not be met by
a Gauss-Newton method unless the eigenvalues of $J^T J$ are bounded away from zero for the
sequence of iterates. Powell [1970] gives an example of convergence of a Gauss-Newton method
with exact line search to a non-stationary point. Moreover, when $J^T J$ is nearly singular, the
(unique) solution to (2.2) can be very large in magnitude compared to $\|J\|_2$ and $\|f\|_2$.

Bounding the norm of the solution is a major concern in formulating criteria for rank
estimation and solution of linear least-squares problems in finite-precision arithmetic, largely because
numerical solutions to (2.3) may not be very accurate when the columns of $J$ are nearly linearly
dependent (see the references cited above on linear least squares). In the context of nonlinear
least squares, another reason to avoid large search directions is that numerical linesearch methods
may not be able to determine an adequate step length when $\|p\|_2$ is large. Moreover, the angle
between $p$ and $g$ may be taken into account in estimating the rank of $J$, since $p$ must be a
descent direction for $f^T f$ that is bounded away from orthogonality to the gradient. We shall see
in Section 5 that, even with these additional considerations that can be brought to bear on (2.3)
due to the outer linesearch algorithm, it may be very difficult to give a numerical definition of
rank.

The performance of Gauss-Newton methods is not fully understood. Gauss-Newton methods
are of practical interest because there are many instances in which they work very well in
comparison to other methods, and in fact most successful specialized approaches to nonlinear
least-squares problems are based to some extent on Gauss-Newton methods and attempt to
exploit this behavior whenever possible (for a survey, see Fraley [1987]). However, it is not hard
to find cases where Gauss-Newton methods perform poorly, so that they cannot be successfully
applied without modification to general nonlinear least-squares problems. These remarks will be
substantiated by examples in Sections 5 and 6.

Perhaps a reason for the variability in the performance of Gauss-Newton methods is that
they are not theoretically well-defined. To see this, let $Q(x)$ be a $k \times m$ orthogonal matrix
function on $\mathbb{R}^n$, that is, $Q(x)^T Q(x) = I$ for all $x$. Then $\|Q(x) f(x)\|_2^2 = \|f(x)\|_2^2$ for all $x$, and
consequently the function $\bar{f} \equiv Qf$ defines the same nonlinear least-squares problem as $f$. The
Jacobian matrix of $\bar{f}$ is $\bar{J} \equiv QJ + (\nabla Q) f$, so that a minimizer of $\|\bar{J} p + \bar{f}\|_2^2$ will ordinarily be
different from a minimizer of $\|Jp + f\|_2^2$, unless $Q(x)$ happens to be a constant transformation.
However, if both $Q$ and $f$ have $k$ continuous derivatives, then $\nabla^i \|Q(x) f(x)\|_2^2 = \nabla^i \|f(x)\|_2^2$
for $i = 1, 2, \ldots, k$. Letting $W \equiv (\nabla Q) f$, so that $\bar{J} = QJ + W$, we have

$$\bar{J}^T \bar{J} = J^T J + (J^T Q^T W + W^T Q J) + W^T W,$$

showing that the Gauss-Newton approximation $J^T J$ to the full Hessian matrix is changed when
$f$ is transformed by an orthogonal function that varies with $x$. Thus, with exact arithmetic,
there are many Gauss-Newton methods corresponding to a given vector function (in fact, each
step of a Gauss-Newton method could be defined by a different transformation of $f$), although
Newton's method remains invariant (see also Nocedal and Overton [1985], p. 826). Moreover,
the conditioning of $\bar{J}$ may be very different from that of $J$, so that, for example, the columns of
$\bar{J}$ might be strongly independent, while $J$ is nearly rank deficient. Since $k$ may be greater than
$n$, it is possible to imbed the given nonlinear least-squares problem in a larger one. To the best
of our knowledge the idea of preconditioning a Gauss-Newton method by an orthogonal function
at each step has never been explored, although some work has been done on conjugate-gradient
acceleration for Gauss-Newton methods in the full-rank case (see Ruhe [1979] and Al-Baali and
Fletcher [1985]).
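
A small numerical check may make the effect of an $x$-dependent orthogonal transformation concrete. In the sketch below everything is hypothetical and chosen only for illustration: the residual `f`, its Jacobian `J`, and the choice of $Q(x)$ as a plane rotation through the angle $x_1$ (so $k = m = n = 2$). The transformed Jacobian is formed as $QJ + W$ with $W = (\nabla Q) f$.

```python
import numpy as np

def f(x):
    # Hypothetical residual vector (m = 2, n = 2).
    return np.array([x[0] ** 2 - x[1], x[0] + x[1] - 1.0])

def J(x):
    return np.array([[2.0 * x[0], -1.0],
                     [1.0, 1.0]])

def Q(x):
    # Rotation by the angle theta = x[0]; orthogonal for every x.
    c, s = np.cos(x[0]), np.sin(x[0])
    return np.array([[c, -s], [s, c]])

def dQ_dtheta(x):
    c, s = np.cos(x[0]), np.sin(x[0])
    return np.array([[-s, -c], [c, -s]])

def J_bar(x):
    # W = (grad Q) f: only the derivative with respect to x[0] is nonzero.
    W = np.outer(dQ_dtheta(x) @ f(x), np.array([1.0, 0.0]))
    return Q(x) @ J(x) + W

x = np.array([1.0, 1.0])
f_bar = Q(x) @ f(x)

g = J(x).T @ f(x)              # gradient from the original residuals
g_bar = J_bar(x).T @ f_bar     # gradient from the transformed residuals

p = np.linalg.lstsq(J(x), -f(x), rcond=None)[0]          # Gauss-Newton step for f
p_bar = np.linalg.lstsq(J_bar(x), -f_bar, rcond=None)[0]  # Gauss-Newton step for Q f
print(g, g_bar, p, p_bar)
```

The two gradients agree, as they must since the objective is unchanged, while the two Gauss-Newton steps differ because $W \neq 0$ at a point with nonzero residual.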

3.  Studies  of  Gauss-Newton  Methods 

Our main concern in this section is with research that specifically addresses computational
aspects of Gauss-Newton methods. Comparisons are most often made to Levenberg-Marquardt
methods for nonlinear least squares (see, for example, Moré [1978]), and to quasi-Newton
methods for unconstrained optimization (see, for example, Dennis and Moré [1977], or any of the
references cited above for unconstrained optimization). For a survey of some of the early (mostly
theoretical) research on Gauss-Newton methods, see Dennis [1977].

Bard [1970] compares some Gauss-Newton-based methods with a Levenberg-Marquardt
method and some quasi-Newton methods for unconstrained optimization on a set of ten test
problems from nonlinear parameter estimation. His results are not directly comparable to the
Gauss-Newton methods described in this paper, because he uses the eigenvalue decomposition
of $J^T J$ in order to solve the normal equations (2.2), and modifies the eigenvalues if their
magnitude falls below a certain threshold in order to ensure a positive-definite system. In addition,
his implementations include bounds on the variables that are enforced by adding a penalty term
to the objective function. He finds that the Gauss-Newton-based methods are more efficient in
terms of function and derivative evaluations than the quasi-Newton methods, but that there is
no significant difference in the relative performance of the Gauss-Newton-based methods and the
Levenberg-Marquardt method.

McKeown [1975a, 1975b] studies test problems of the form

$$f(x) = f_0 + G_0 x + \tfrac{1}{2} \begin{pmatrix} x^T H_1 x \\ \vdots \\ x^T H_m x \end{pmatrix},$$

chosen so that factors affecting the rate of convergence could be controlled. He uses three
such problems, each with seven different values of a parameter that varies an asymptotic linear
convergence factor. The algorithms tested include some quasi-Newton methods for unconstrained
optimization, as well as some specialized methods for nonlinear least squares that have since
been superseded. He concludes that, when the asymptotic convergence factor is small, the
Gauss-Newton method is more efficient than the quasi-Newton methods but that the opposite is true
when the asymptotic convergence factor is large. No mention is made of strategies to deal with
rank-deficient Jacobians in the Gauss-Newton method, so that presumably this situation is never
encountered in his experiments. We included these problems in our numerical tests (see the results
for problems 39-41 in the Appendix) and found that the Jacobian matrix was well-conditioned
at each iteration in every case.

Ramsin and Wedin [1977] compare the performance of a Gauss-Newton-based method with
that of a Levenberg-Marquardt method for nonlinear least squares and a quasi-Newton method for
unconstrained optimization, both from the Harwell Library. The quasi-Newton routine required
an initial estimate $H_0$ of the Hessian matrix, and the choice $H_0 = J(x_0)^T J(x_0)$ was made on
the basis of preliminary tests that showed equal or better performance compared to $H_0 = I$.
The test problems were constructed so that asymptotic properties could be monitored and are
similar to those of McKeown [1975a, 1975b] mentioned above. In all cases considered, the
Jacobian matrix had full column rank at the solution. The algorithm of Ramsin and Wedin uses
the steepest-descent direction, rather than the Gauss-Newton direction, whenever the decrease
in the objective is considered unacceptably small. The experiments involved variation of a large
number of parameters. Ramsin and Wedin conclude that their Gauss-Newton-based method and
the Levenberg-Marquardt method are identical when the asymptotic convergence factor is small,
but that the results do not show that either method is consistently better for large asymptotic
convergence factors. Also, they find that in instances when the asymptotic convergence factor
is large, the quasi-Newton method may be more efficient, although superlinear convergence of
the quasi-Newton method was never observed. Ramsin and Wedin maintain that Gauss-Newton
should not be used when (i) the current iterate $x_k$ is close to the solution $x^*$, and the relative
decrease in the size of the gradient is small, when (ii) $x_k$ is not near $x^*$, and the decrease in
the sum of squares relative to the size of the gradient is small, or when (iii) $J_k$ is nearly
rank-deficient. Conditions (i) and (ii) are indicators of inefficiency for any minimization algorithm;
in general the problem of ascertaining the closeness of an iterate to a minimum is as difficult as
solving the original problem. As for condition (iii), we show in Section 5 that rapidly convergent
Gauss-Newton methods may exist even if nearly rank-deficient Jacobians are encountered, but
that it appears that different rules for estimating the rank of the Jacobian must be applied to
different types of nonlinear least-squares problems in order to obtain this favorable behavior.

Deuflhard and Apostolescu [1980] suggest selecting a step length for the Gauss-Newton
direction based on decreasing the merit function $\|J_k^\dagger f(x)\|_2$ rather than $\|f(x)\|_2$, for a class
of nonlinear least-squares problems that includes zero-residual problems. Here $J_k^\dagger$ is
the pseudo-inverse of $J_k$ (see, for example, Golub and Van Loan [1983], Chapter 6); $-J_k^\dagger f_k$ is
another way of representing the minimum $\ell_2$-norm solution to $\min_p \|J_k p + f_k\|_2$. They reason that
the Gauss-Newton direction is the steepest-descent direction for the function $\|J_k^\dagger f(x)\|_2$, so that
the geometry of the level surfaces defined by $\|J_k^\dagger f(x)\|_2$ is more favorable to avoiding small steps
in the linesearch. A shortcoming of this approach is that there are no global convergence
results. The merit function depends on $x_k$, so that a different function is being reduced at each
step. Another difficulty is that, although the authors state that numerical experience supports
selection of a step length based on $\|J_k^\dagger f(x)\|_2$ for ill-conditioned problems, the transformation $J_k^\dagger$
is not numerically well-defined under these circumstances. Therefore neither the Gauss-Newton
search direction, nor the merit function, is numerically well-defined when the columns of $J_k$ are
nearly linearly dependent.
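
The steepest-descent property mentioned above is easy to verify numerically when $J_k$ has full column rank. The following sketch is only a check of that linear-algebra identity, not an implementation of the Deuflhard and Apostolescu algorithm; the random test data and variable names are assumptions. With $J_k^\dagger$ held fixed, the gradient at $x_k$ of $T(x) = \tfrac{1}{2}\|J_k^\dagger f(x)\|_2^2$ is $J_k^T (J_k^\dagger)^T J_k^\dagger f_k$, and its negative coincides with the Gauss-Newton direction $-J_k^\dagger f_k$.

```python
import numpy as np

rng = np.random.default_rng(0)
J_k = rng.standard_normal((5, 3))   # hypothetical Jacobian, full column rank
f_k = rng.standard_normal(5)        # hypothetical residual vector

J_pinv = np.linalg.pinv(J_k)        # pseudo-inverse of J_k
p_gn = -J_pinv @ f_k                # Gauss-Newton (minimum-norm) direction

# Gradient of T(x) = 0.5 * ||J_k^+ f(x)||^2 at x = x_k, with J_k^+ frozen.
grad_T = J_k.T @ J_pinv.T @ (J_pinv @ f_k)

print(np.linalg.norm(p_gn + grad_T))
```

The identity relies on $J_k^\dagger J_k = I$, which fails when $J_k$ is rank deficient; this is one way to see why the merit function loses its meaning in exactly the ill-conditioned cases of interest.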

Wedin and Lindström [1987] present a hybrid algorithm for nonlinear least squares that
combines a Gauss-Newton method with a finite-difference Newton method. The Gauss-Newton
method is implemented with a QR factorization and a scheme for rank estimation that depends
on information from the previous iteration, as well as on a user-supplied tolerance. They give
numerical results for a set of thirty large-residual test problems constructed by Al-Baali and
Fletcher [1985], and compare their results with those given by Al-Baali and Fletcher for two
hybrid Gauss-Newton/BFGS methods and a version of NL2SOL. Wedin and Lindström find that
their method gives better overall results than the other methods, although their method does fail
in three cases due to a finite-difference Hessian that is not positive definite.

Fraley [1987] gives numerical results for a large set of test problems using widely-distributed
software for unconstrained optimization and nonlinear least squares. She also includes some
Gauss-Newton methods that use LSSOL [Gill et al. (1986a)] to solve the linear least-squares
subproblem (these results are reproduced in an appendix to this paper). Her findings confirm
that Gauss-Newton methods are often among the best available techniques for nonlinear least
squares, especially zero-residual problems, but that there are many cases in which they fail
or are inefficient. However, no general a priori characterization is given of those problems on
which Gauss-Newton will work well; the present paper gives some insight into why it is difficult
to do so.

4.  Description  of  Numerical  Results 

This  section  gives  general  information  on  the  numerical  results  that  are  presented  in  the 
remainder  of  this  paper. 

In the examples of Section 5, the LINPACK routine DSVDC [Dongarra et al. (1979)] is
used to compute the singular-value decomposition (SVD) of the Jacobian at each iteration; the
linear least-squares subproblems within the Gauss-Newton methods are then solved via the SVD.
A detailed description of the solution procedure for the subproblems is given in that section.
The same procedure is also used for the Gauss-Newton example in Section 6, although rank
estimation is not an issue there because the Jacobian is well-conditioned. The linesearch for the
Gauss-Newton examples in Section 5, as well as for all of the numerical results in Section 6, is
taken from the nonlinear programming package NPSOL [Gill et al. (1979), (1986b)], and requires
both function and gradient evaluations.

The Gauss-Newton methods in Section 5 are compared to numerical results for some
unconstrained optimization methods using the following widely-distributed software:

program        method                  derivative     global          source
                                       information    strategy
MNA/E04LBF     modified Newton         second         linesearch      NPL/NAG
NPSOL          quasi-Newton (BFGS)     first          linesearch      SOL/NAG
DMNH/HUMSL     modified Newton         second         trust region    PORT/ACM
DMNG/SUMSL     quasi-Newton (BFGS)     first          trust region    PORT/ACM

These programs come from the following software sources:

NAG  - Numerical Algorithms Group, Inc.
NPL  - National Physical Laboratory, England
PORT - PORT Mathematical Software Library, A. T. & T. Bell Laboratories, Inc.
ACM  - Association for Computing Machinery
SOL  - Systems Optimization Laboratory, Stanford University


The  following  keywords,  listed  under  the  label  'conv.'  in  the  tables,  are  used  to  describe 
abnormal  termination  conditions : 

f  lim  -  function  evaluation  limit  reached 
loop  -  subroutine  appears  to  loop 
time  -  time  limit  exceeded 

In the tables, under the label 'est. err.', we include the quantity

$$\frac{\|f_k\|_2 - \|f_{best}\|_2}{1 + \|f_{best}\|_2},$$

where $f_k$ is the value of $f$ at the point of termination, and $\|f_{best}\|_2$ is the best available estimate
of the norm of the solution, in order to get some idea of the error in $\|f_k\|_2$. For those problems
that have nonzero residuals, the value of $\|f_{best}\|_2$ is given to six figures of accuracy, rounded
down.

We  use  the  notation  rank(J)  for  numerical  definitions  of  the  rank  of  the  Jacobian,  and 
cond(J)  for  the  condition  number  of  the  Jacobian  (the  ratio  of  the  largest  singular  value  to  the 
smallest  singular  value  —  see,  for  example,  Golub  and  Van  Loan  [1983]). 

Two  sets  of  data  are  given  for  each  routine  on  each  example,  corresponding  to  two  different 
sets  of  values  for  parameters  in  the  termination  criteria.  This  data  is  taken  from  Chapter  2  of 
Fraley  [1987],  which  contains  detailed  information  about  the  choices  made  for  the  parameter 
values. 

All of the programs were run in FORTRAN using double precision on the IBM 3081 and IBM
3033 computers at Stanford Linear Accelerator Center, for which

relative machine precision $\epsilon_M = 2.22\ldots \times 10^{-16}$; $\sqrt{\epsilon_M} = 1.49\ldots \times 10^{-8}$.

5.  Performance on Problems with Ill-Conditioned Jacobians

We explained in Section 2 that the Gauss-Newton framework defines a class of methods,
whose members are distinguished by the numerical algorithm for solving the linear least-squares
subproblem (2.3) for a search direction, as well as by the linesearch method that is subsequently
used to find a steplength along that direction. This section is concerned with the variability in
Gauss-Newton algorithms that is due to computational procedures for the linear least-squares
subproblem. The most stable techniques for solving ill-conditioned linear least-squares problems
involve orthogonal factorizations: the singular-value decomposition (SVD) and the QR
factorization (see the references cited in Section 1 on linear least squares). The linear least-squares
subproblems within our Gauss-Newton examples are solved by means of the SVD. Results are
not given for Gauss-Newton methods that use the QR factorization, because the same basic
considerations apply in choosing the search direction, and also because in practice the behavior
is similar to that observed for the SVD.


5.1.  SVD  Solution  to  Linear  Least-Squares  Subproblems 


Given the singular-value decomposition of the Jacobian

$$J = \begin{cases} U \begin{pmatrix} S & 0 \end{pmatrix} V^T, & \text{if } m < n; \\ U S V^T, & \text{if } m = n; \\ U \begin{pmatrix} S \\ 0 \end{pmatrix} V^T, & \text{if } m > n; \end{cases} \qquad (5.1)$$

where $S$ is diagonal with non-negative diagonal entries $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min\{m,n\}}$, and $U$ and
$V$ are orthogonal, define

$$r_{max} \equiv \max \{\, i \mid \sigma_i \neq 0 \,\}.$$

Let

$$p_i \equiv \sum_{j=1}^{i} \tau_j v_j, \qquad \tau_j \equiv -\frac{u_j^T f}{\sigma_j}, \qquad (5.2)$$

where $u_j, v_j$ are the $j$th columns of $U$ and $V$, respectively. The columns of $V$ form an orthonormal
basis for $\mathbb{R}^n$, and $\tau_j$, $j = 1, 2, \ldots, i$, are the components of $p_i$ in terms of this basis, with

$$\|p_i\|_2^2 = \sum_{j=1}^{i} \tau_j^2.$$

When $i < \min\{m, n\}$, $p_i$ has no component in the space spanned by $\{v_{i+1}, \ldots, v_{\min\{m,n\}}\}$.

In exact arithmetic, each $p_i$ is either orthogonal to the gradient $g = J^T f$ of the nonlinear
least-squares objective, or it is a descent direction (see, for example, Chapter 4 of Fraley [1987]).

In practice the SVD cannot be computed exactly, and the solution to the linear least-squares
subproblem (2.3) is taken to be $p_r$, where $r \le r_{max}$ is an estimate of the rank of $J$. In the
examples below, the numerical rank of the Jacobian is defined to be

$$\mathrm{rank}(J) \equiv \max \{\, i \mid \sigma_i > \epsilon (1 + \sigma_1) \,\}. \qquad (5.3)$$

This criterion depends only on $J$ and does not take into account the size of the search direction
$p$, the angle between $p$ and the gradient, or the vector $f$. Some specific examples will now be
given that show some of the difficulties involved in defining rank($J$) for Gauss-Newton methods.
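
The truncated-SVD search direction and the rank criterion (5.3) can be sketched in a few lines of Python. This is an illustrative sketch, not the DSVDC-based code used for the experiments: NumPy's `svd` replaces the LINPACK routine, and the function names, the test matrix, and the two tolerance values are assumptions chosen to make the effect visible.

```python
import numpy as np

def numerical_rank(sigma, eps):
    # Criterion (5.3): largest i with sigma_i > eps * (1 + sigma_1).
    # sigma is sorted in decreasing order, so counting gives that index.
    return int(np.sum(sigma > eps * (1.0 + sigma[0])))

def svd_search_direction(J, f, eps):
    """Return p_r, the truncated-SVD solution of min ||J p + f||_2, and r."""
    U, sigma, Vt = np.linalg.svd(J, full_matrices=False)
    r = numerical_rank(sigma, eps)
    tau = -(U[:, :r].T @ f) / sigma[:r]   # components of p_r in the basis v_1..v_r
    return Vt[:r].T @ tau, r

# Hypothetical ill-conditioned Jacobian with singular values 1 and 1e-12.
J = np.array([[1.0, 0.0],
              [0.0, 1e-12],
              [0.0, 0.0]])
f = np.array([1.0, 1.0, 0.0])

p_loose, r_loose = svd_search_direction(J, f, eps=1e-10)   # small sigma discarded
p_tight, r_tight = svd_search_direction(J, f, eps=1e-14)   # small sigma retained
print(r_loose, np.linalg.norm(p_loose), r_tight, np.linalg.norm(p_tight))
```

Changing the tolerance by a few orders of magnitude changes the estimated rank from 1 to 2 and the length of the search direction by many orders of magnitude, which is the kind of sensitivity the examples below exhibit.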

5.2.  Chebyquad  n = m = 8  (#35a)

The first example is related to the problem of locating nodes for Chebyshev quadrature
[Fletcher (1965); Moré, Garbow, and Hillstrom (1981)], and demonstrates that the choice of $\epsilon$
in (5.3) can be critical.


Gauss-Newton      ε = 10^-14     ε ≤ 10^-15
f, J evals.       147            124
iters.            44             19
||x*||_2          1.65           1.63
||f||_2           10^-2          10^-1
||g||_2           10^-11         10^-1
est. err.         10^-9          10^-2

The algorithm succeeds in finding an approximate minimum when $\epsilon = 10^{-14}$, although it fails
when $\epsilon = 10^{-15}$. The problem is rather easily solved by the unconstrained methods, as shown in
the table below.

               MNA                  DMNH                      NPSOL                  DMNG
f evals.       46       46          14            14          33       35            34            38
J evals.       46       46          11            11          33       35            24            28
iters.         15       15          11            11          19       21            24            27
||x*||_2       1.65     1.65        1.65          1.65        1.65     1.65          1.65          1.65
||f||_2        10^-1    10^-1       10^-1         10^-1       10^-1    10^-1         10^-1         10^-1
||g||_2        10^-10   10^-10      [illegible]   10^-9       10^-5    10^-7         [illegible]   10^-8
est. err.      10^-9    [illegible] 10^-9         10^-9       10^-9    [illegible]   10^-9         10^-8

The next two tables trace the progress of the Gauss-Newton methods for $\epsilon = 10^{-14}$ and
$\epsilon = 10^{-15}$ respectively.


Gauss-Newton on Problem 35a

 k    f,J evals.   ||x_k||_2   ||f_k||_2   ||g_k||_2
 0         8       2.E+00      2.E-01      8.E-01
 1        16       2.E+00      2.E-01      7.E-01
 2        24       2.E+00      2.E-01      7.E-01
 3        32       2.E+00      2.E-01      6.E-01
 4        35       2.E+00      2.E-01      5.E-01
 5        37       2.E+00      1.E-01      3.E-01
 6        41       2.E+00      1.E-01      2.E-01
 7        47       2.E+00      1.E-01      2.E-01
 8        54       2.E+00      1.E-01      2.E-01
 9        62       2.E+00      1.E-01      2.E-01
10        69       2.E+00      1.E-01      2.E-01
11        76       2.E+00      1.E-01      2.E-01
12        83       2.E+00      1.E-01      2.E-01
13        90       2.E+00      1.E-01      2.E-01
14        97       2.E+00      1.E-01      2.E-01
15       104       2.E+00      1.E-01      2.E-01
16       111       2.E+00      1.E-01      2.E-01
17       118       2.E+00      1.E-01      2.E-01
18       120       2.E+00      1.E-01      2.E-01
19       123       2.E+00      8.E-02      2.E-01
20       124       2.E+00      6.E-02      7.E-02
21       125       2.E+00      6.E-02      3.E-02
22       126       2.E+00      8.E-02      1.E-02
23       127       2.E+00      6.E-02      6.E-03
24       128       2.E+00      6.E-02      2.E-03
25       129       2.E+00      6.E-02      8.E-04
26       130       2.E+00      6.E-02      3.E-04
27       131       2.E+00      8.E-02      1.E-04
28       132       2.E+00      6.E-02      6.E-06
29       133       2.E+00      6.E-02      2.E-05
30       134       2.E+00      6.E-02      8.E-06
31       135       2.E+00      6.E-02      3.E-06
32       136       2.E+00      6.E-02      1.E-06
33       137       2.E+00      6.E-02      8.E-07
34       138       2.E+00      6.E-02      2.E-07
35       139       2.E+00      6.E-02      8.E-08
36       140       2.E+00      6.E-02      3.E-08
37       141       2.E+00      6.E-02      1.E-08
38       142       2.E+00      6.E-02      6.E-08
39       143       2.E+00      6.E-02      2.E-09
40       144       2.E+00      6.E-02      7.E-10
41       145       2.E+00      6.E-02      3.E-10
42       146       2.E+00      6.E-02      1.E-10
43       147       2.E+00      6.E-02      4.E-11
                   2.E+00      8.E-02      2.E-11

10"14 


Ml, 

SjPk 

cond 

rank 

Jk 

Jk 

2.8+00 

-4.E-02 

7.3E-02 

102 

8 

3. E+00 

-3.E-02 

1.6E-02 

102 

8 

2. E+00 

-3.E-02 

1.6E-02 

102 

8 

4. E+00 

-3.8-02 

3.6E-02 

102 

8 

7.8-01 

-3.E-02 

3.1E-01 

102 

8 

2.8-01 

-1.8-02 

2.28-01 

101 

8 

6.E-01 

-l.E-02 

1.6E-02 

102 

8 

1.8+01 

-9.8-03 

6.0E-06 

103 

8 

1.8+02 

-9.8-03 

4.9E-07 

104 

8 

1.8+03 

-9.8-03 

4.8E-09 

10s 

8 

1.8+04 

-9.8-03 

6.1E-11 

10® 

8 

1 .8+08 

-9.8-03 

6.1E-13 

107 

8 

1.8+06 

-9 . 8-03 

6.0E-16 

10® 

8 

l.E+07 

-9.E-03 

4.9E-17 

10® 

8 

l.E+08 

-9.E-03 

4.9E-19 

1010 

8 

1 . E+09 

-9.E-03 

4.7E-21 

1011 

8 

1 .8+10 

-9.8-03 

4.7E-23 

1012 

8 

l.B+11 

-9.E-03 

4.7E-26 

1013 

8 

8.B-02 

-4.E-03 

6.7E-01 

1014 

7 

8.B-03 

-8.E-04 

2.1E+00 

1014 

7 

9.8-03 

-6.8-04 

l.OE+OO 

1014 

7 

2.8-03 

-6.8-06 

l.OE+OO 

1014 

7 

1 . 8-03 

-l.E-06 

l.OE+OO 

1014 

7 

4.E-04 

-2.8-06 

l.OE+OO 

1014 

7 

2.8-04 

-3.8-07 

l.OE+OO 

1014 

7 

6.8-06 

-4.E-08 

l.OE+OO 

1014 

7 

3 . E-OS 

-2.8-09 

l.OE+OO 

1014 

7 

1.8-06 

-1.8-09 

l.OE+OO 

1014 

7 

4.8-06 

-2.E-10 

1 .  OE+OO 

1014 

7 

2.E-06 

-2.8-11 

1 . OE+OO 

1014 

7 

6.E-07 

-4.E-12 

l.OE+OO 

1014 

7 

2.E-07 

-6.8-13 

l.OE+OO 

1014 

7 

9.8-08 

-9.8-14 

1 .  OE+OO 

1014 

7 

4.8-08 

-1.8-14 

l.OE+OO 

1014 

7 

1.8-08 

-2.8-16 

l.OE+OO 

1014 

7 

6.8-09 

-3.E-16 

l.OE+OO 

1014 

7 

2.8-09 

-6.8-17 

l.OE+OO 

1014 

7 

9.8-10 

-8.8-18 

l.OE+OO 

1014 

7 

4.8-10 

-liB-lB 

1.  OE+OO 

1014 

7 

1.8-10 

-2.8-19 

1.08+00 

1014 

7 

6.8-11 

-3.8-20 

l.OE+OO 

1014 

7 

2.8-11 

-6.8-21 

1.08+00 

1014 

7 

8.8-12 

-7.8-22 

l.OE+OO 

1014 

7 

3.E-12 

-1.8-22 

l.OE+OO 

1014 

7 



Gauss-Newton on Problem 35a, ε = 10^-15

 k   f,J evals.  ||x_k||_2  ||f_k||_2  ||g_k||_2  ||p_k||_2  g_k^T p_k   alpha_k   cond J_k  rank J_k

 0        8      2.E+00     2.E-01     8.E-01     2.E+00    -4.E-02    7.3E-02    10^2        8
 1       16      2.E+00     2.E-01     7.E-01     3.E+00    -3.E-02    1.5E-02    10^2        8
 2       24      2.E+00     2.E-01     7.E-01     2.E+00    -3.E-02    1.6E-02    10^2        8
 3       32      2.E+00     2.E-01     8.E-01     4.E+00    -3.E-02    3.5E-02    10^2        8
 4       35      2.E+00     2.E-01     6.E-01     7.E-01    -3.E-02    3.1E-01    10^2        8
 5       37      2.E+00     1.E-01     3.E-01     2.E-01    -1.E-02    2.2E-01    10^1        8
 6       41      2.E+00     1.E-01     2.E-01     6.E-01    -1.E-02    1.8E-02    10^2        8
 7       47      2.E+00     1.E-01     2.E-01     1.E+01    -9.E-03    5.0E-05    10^3        8
 8       54      2.E+00     1.E-01     2.E-01     1.E+02    -8.E-03    4.9E-07    10^4        8
 9       62      2.E+00     1.E-01     2.E-01     1.E+03    -9.E-03    4.8E-09    10^5        8
10       69      2.E+00     1.E-01     2.E-01     1.E+04    -9.E-03    5.1E-11    10^6        8
11       76      2.E+00     1.E-01     2.E-01     1.E+05    -9.E-03    5.1E-13    10^7        8
12       83      2.E+00     1.E-01     2.E-01     1.E+06    -9.E-03    6.0E-15    10^8        8
13       90      2.E+00     1.E-01     2.E-01     1.E+07    -9.E-03    4.9E-17    10^9        8
14       97      2.E+00     1.E-01     2.E-01     1.E+08    -9.E-03    4.9E-19    10^10       8
15      104      2.E+00     1.E-01     2.E-01     1.E+09    -9.E-03    4.7E-21    10^11       8
16      111      2.E+00     1.E-01     2.E-01     1.E+10    -9.E-03    4.7E-23    10^12       8
17      118      2.E+00     1.E-01     2.E-01     1.E+11    -9.E-03    4.7E-25    10^13       8
18      124      2.E+00     1.E-01     2.E-01     1.E+12    -9.E-03    0.0E-01    10^14       8


Until iteration 18, the Jacobian has full column rank at each step according to (5.3), and it becomes increasingly ill-conditioned as the computation proceeds. The search direction grows very large and approaches orthogonality to the gradient, while the step length decreases. No significant decrease is observed in either $\|f\|_2$ or $\|g\|_2$ in iterations 6 - 17. At iteration 18, the two Gauss-Newton methods differ. For $\epsilon = 10^{-14}$, the estimated rank of the Jacobian is reduced to 7, and a significant decrease in the function is achieved. For $\epsilon \le 10^{-15}$, by (5.3) the Jacobian still has full column rank, and the algorithm terminates because $\alpha_k p_k$ is judged to be negligible relative to $\|x_k\|_2$. Detailed information at the start of iteration 18 for the Gauss-Newton methods is given in the next table.


ε ≤ 10^-14; iteration 18

r    sigma_r    |tau_r|    ||p_r||_2    |g^T p_r|    |cos(g, p_r)|

1    10^1       10^-3      10^-3        10^-4        10^0
2    10^1       10^-16     10^-3        10^-4        10^0
3    10^0       10^-16     10^-3        10^-4        10^0
4    10^0       10^-2      10^-2        10^-3        10^0
5    10^0       10^-15     10^-2        10^-3        10^0
6    10^0       10^-1      10^-1        10^-3        10^-1
7    10^0       10^-14     10^-1        10^-3        10^-1
8      ?        10^12      10^12        10^-2        10^-13


It seems reasonable to say that rank(J) = 7 rather than rank(J) = 8 at this point, because $\sigma_8 \ll \sigma_7$, $\|p_8\|_2 \gg \|p_7\|_2$, and $|\cos(g, p_8)| \ll |\cos(g, p_7)|$. Hence it is not surprising that it is the method with $\epsilon = 10^{-14}$, rather than the one with $\epsilon = 10^{-15}$, that ultimately makes good progress toward the solution.
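The quantities compared above can be computed for every candidate rank from the singular values and the coefficients $u_i^T f$ alone. The sketch below assumes the usual truncated-SVD form of the Gauss-Newton direction, $p_r = -\sum_{i \le r} (u_i^T f / \sigma_i)\, v_i$, with $g = J^T f$; the function name and the three-value example are hypothetical and are not taken from the tables.

```python
import math


def truncated_direction_table(sigma, utf):
    """Summarise the truncated-SVD direction p_r for each candidate rank r.

    sigma : singular values of J, largest first
    utf   : coefficients u_i^T f of the residual in the left singular basis
    Returns tuples (r, |tau_r|, ||p_r||_2, |g^T p_r|, |cos(g, p_r)|).
    """
    # g = J^T f has coefficients sigma_i * (u_i^T f) in the right singular basis.
    norm_g = math.sqrt(sum((s * c) ** 2 for s, c in zip(sigma, utf)))
    rows, sum_tau_sq, sum_c_sq = [], 0.0, 0.0
    for r, (s, c) in enumerate(zip(sigma, utf), start=1):
        tau = -c / s              # coefficient of v_r in the direction
        sum_tau_sq += tau * tau   # ||p_r||^2 is the running sum of tau_i^2
        sum_c_sq += c * c         # -g^T p_r is the running sum of (u_i^T f)^2
        norm_p = math.sqrt(sum_tau_sq)
        denom = norm_g * norm_p
        cosine = sum_c_sq / denom if denom > 0.0 else 0.0
        rows.append((r, abs(tau), norm_p, sum_c_sq, cosine))
    return rows


# Hypothetical data: one singular value at roundoff level.
rows = truncated_direction_table([10.0, 1.0, 1e-14], [1e-2, 1e-1, 1e-2])
for row in rows:
    print("r=%d  |tau|=%.1e  ||p||=%.1e  |g'p|=%.1e  |cos|=%.1e" % row)
```

Including the last singular value inflates the norm of the direction by many orders of magnitude while leaving its inner product with the gradient almost unchanged, so the cosine collapses; this is the pattern visible in the last row of the iteration 18 table.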

The behavior of the Gauss-Newton methods can be explained by comparing the sequence $\{p_k^*\}$ of steps from the iterates to the minimum of the function, to the sequence $\{p_k\}$ of Gauss-Newton steps. The magnitudes of the components of these vectors in terms of the basis $\{v_i(x_k)\}$, for iterations 6 - 18, are listed in the tables below.


components {tau_i^*(x_k)} of p_k^* = x^* - x_k in terms of {v_i(x_k)}

 k   |tau_1^*|  |tau_2^*|  |tau_3^*|  |tau_4^*|  |tau_5^*|  |tau_6^*|  |tau_7^*|  |tau_8^*|

 6    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-3
 7    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-4
 8    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-5
 9    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-6
10    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-7
11    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-8
12    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-9
13    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-10
14    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-11
15    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-12
16    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-13
17    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-14
18    10^-2      10^-9      10^-8      10^-2      10^-9      10^-2      10^-9      10^-15



components {tau_i(x_k)} of p_k in terms of {v_i(x_k)}

 k   |tau_1|   |tau_2|   |tau_3|   |tau_4|   |tau_5|   |tau_6|   |tau_7|   |tau_8|

 6   10^-3     10^-17    10^-16    10^-2       ?       10^-1     10^-15    10^0
 7   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^1
 8   10^-3     10^-17    10^-18    10^-2     10^-15    10^-1     10^-14    10^2
 9   10^-3       ?       10^-16    10^-2     10^-15    10^-1     10^-14    10^3
10   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^4
11   10^-3     10^-17    10^-16    10^-2       ?       10^-1     10^-14    10^5
12   10^-3     10^-17    10^-18    10^-2     10^-15    10^-1       ?       10^6
13   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^7
14   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^8
15   10^-3     10^-16    10^-18    10^-2     10^-15    10^-1     10^-14    10^9
16   10^-3     10^-16    10^-16    10^-2       ?       10^-1     10^-14    10^10
17   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^11
18   10^-3     10^-16    10^-16    10^-2     10^-15    10^-1     10^-14    10^12


The step $p_k^*$ to the minimum approaches orthogonality to $v_8(x_k)$, while the Gauss-Newton search direction becomes dominated by the component in the direction of $v_8(x_k)$ due to the ill-conditioning in the Jacobian. Hence, by iteration 18, $p_k^*$ is almost orthogonal to $p_k$. The question of when to say that J has rank 7 rather than rank 8 is a difficult one. If full column rank is assumed until the search direction becomes numerically orthogonal to the gradient, then the method may become very inefficient (see iterations 6 - 18, where about seven function evaluations are required per iteration). On the other hand, if the step to the minimum has a component in the estimated null space null(J), underestimating rank(J) will inhibit decrease in null(J), because the Gauss-Newton search direction will be orthogonal to null(J).

5.3.  Matrix  Square  Root  1  n  =  m  =  4  (#  36a.) 

Another  instance  in  which  Gauss-Newton  methods  encounter  ill-conditioned  Jacobians  is  the 
problem  of  finding  the  square  root  of  a  given  (square)  matrix  (see  the  Appendix).  Although  the 
matrix  in  question  is  only  of  order  2,  the  problem  is  a  difficult  one  for  the  unconstrained  methods, 
as  shown  in  the  table  below. 

NPSOL    DMNG    MNA    DMNH



/  evals. 

4001 

4001 

4000 

4000 

786 

2618 

4000 

4000 

J  evals. 

4001 

4001 

2190 

2190 

786 

2618 

2891 

2891 

iters. 

2663 

2663 

2190 

2190 

477 

1437 

2891 

2891 

Ik-lli 

50.4 

50.4 

17.8 

17.8 

9.22 

10.1 

17.0 

17.0 

IIZ-lli 

IO'9 

10~9 

10"6 

10"® 

10-4 

10's 

10"® 

10"® 

uni. 

10'9 

10-9 

10"® 

10"® 

10_* 

IO"7 

10"® 

10"® 

eat.  err. 

io-19 

IO"19 

W 

w* 

1 

o 

s-H 

10~12 

IO'9 

h-» 

o 

1 

10_n 

10"n 

conv. 

p  lim. 

p  lim. 

P  LIM. 

P  LIM. 

P  LIM. 

P  LIM. 

MNA is just Newton's method in this case, since the exact Hessian matrix is never modified, although it does become ill-conditioned, with a condition number of order $10^{11}$ at the solution. In the Gauss-Newton methods, the Jacobian does become ill-conditioned, but unlike the previous problem, a solution is obtained only when the Jacobian is assumed to have full rank at each iteration. A summary of the results for $\epsilon = 10^{-10}$ and $\epsilon \le 10^{-11}$ is given in the following table.


Gauss-Newton      ε = 10^-10     ε ≤ 10^-11

f, J evals.          4004             95
iters.                473             39
||x*||_2               ?            50.0
||f*||_2            10^-7             ?
||g*||_2               ?            10^-15
est. err.           10^-15          10^-33
conv.               F LIM.

The next two tables trace the iterations of the Gauss-Newton method for $\epsilon = 10^{-10}$ and $\epsilon = 10^{-11}$, respectively.

Gauss-Newton on Problem 36a


$\epsilon = 10^{-10}$


Jfc 

evals. 

m2 

IIAII, 

llAlla 

Ib/kllj 

fkPk 

Qfc 

cond 

A 

rank 

Jk 

0 

2 

1 .  E+00 

2 . 8+00 

3 . E+00 

9.8-01 

-3 .E+00 

1  ..OE+OO 

10° 

4 

1 

3 

9.E-01 

fl.E-01 

6.E-01 

8.E-01 

-4.E-01 

l.OE+OO 

101 

4 

2 

5 

1.E+00 

4.E-01 

7.E-01 

1.8+00 

-1.8-01 

6.6E-01 

102 

4 

3 

7 

2.8+00 

3.8-01 

8.E-01 

2.8+00 

-8.8-02 

4.68-01 

103 

4 

4 

9 

3 .  E+00 

2.E-01 

8.E-01 

2.8+00 

-6.8-02 

4.0E-01 

104 

4 

5 

11 

4 .  E+OO 

2.8-01 

9.E-01 

3.8+00 

-3.8-02 

3.7E-01 

104 

4 

6 

13 

8. E+00 

2.E-01 

1 .E+00 

3.8+00 

-2.8-02 

3.4E-01 

10s 

4 

7 

15 

6.  £4-00 

l.E-01 

1 . E+00 

4. E+00 

-2.8-02 

3.38-01 

10s 

4 

8 

17 

7. E+00 

i.E-01 

1 . E+00 

4. E+00 

-1.8-02 

3.2E-01 

106 

4 

9 

19 

8. E+00 

l.E-01 

1.E+00 

6.8+00 

-1.8-02 

3.1E-01 

106 

4 

10 

22 

l.E+01 

l.E-01 

1.8+00 

8.8+00 

-l.E-02 

2.0E-01 

107 

4 

11 

25 

l.E+01 

9.E-02 

1 . 8+00 

6.8+00 

-8.8-03 

1.8E-01 

107 

4 

12 

28 

l.E+01 

8.E-02 

1 .E+00 

7. E+00 

-7.8-03 

1.7E-01 

107 

4 

13 

31 

l.B+01 

7.E-02 

1.8+00 

7 . 8+00 

-6.8-03 

1.68-01 

107 

4 

14 

34 

l.E+01 

7.E-02 

1 . E+OO 

8.8+00 

-6.E-03 

1.6E-01 

108 

4 

15 

37 

2.E+01 

6.E-02 

1 . E+00 

8. E+00 

-4.8-03 

1.6E-01 

10s 

4 

16 

40 

2.E+01 

6 . 8-02 

1.8+00 

8.8+00 

-3.B-03 

1.4E-01 

10s 

4 

17 

43 

2.E+01 

6.8-02 

1.8+00 

9.8+00 

-3.8-03 

1.4E-01 

108 

4 

18 

46 

2.E+01 

6.E-02 

1.8+00 

9.8+00 

-3.E-03 

1.4E-01 

108 

4 

19 

49 

2.B+01 

6.E-02 

1 . 8+00 

9.8+00 

-2.E-03 

1.3E-01 

108 

4 

20 

52 

2.E+01 

4.8-02 

1 . 8+00 

1.8+01 

-2.E-03 

1.3E-01 

109 

4 

21 

55 

2.E+01 

4.E-02 

l.E+OO 

l.E+01 

-2.E-03 

1.3E-01 

109 

4 

22 

58 

2.E+01 

4.8-02 

1.8+00 

1.8+01 

-l.E-03 

1.3E-01 

109 

4 

23 

61 

3.8+01 

4.8-02 

1.8+00 

1.8+01 

-1.8-03 

1.3E-01 

109 

4 

24 

64 

3.8+01 

3.8-02 

1.8+00 

1.8+01 

-1.8-03 

1.3E-01 

109 

4 

25 

67 

3.B+01 

3.8-02 

1.8+00 

1.8+01 

-1.8-03 

1.38-01 

10® 

4 

26 

70 

3.B+01 

3.8-02 

1.8+00 

1.8+01 

-9.8-04 

1.48-01 

10® 

4 

27 

73 

3.B+01 

3.8-02 

1.8+00 

1.8+01 

-7.8-04 

1.4E-01 

109 

4 

28 

76 

3.E+01 

3.8-02 

1.8+00 

1.8+01 

-6.8-04 

1.68-01 

1010 

4 

29 

79 

3.E+01 

3.8-02 

1.S+00 

1.8+01 

-6.8-04 

1.68-01 

1010 

4 

30 

82 

4.E+01 

2.8-02 

1.8+00 

9.8+00 

-6.8-04 

1.78-01 

1010 

4 

31 

85 

4.E+01 

2.8-02 

1.8+00 

9.8+00 

-4.8-04 

1.98-01 

1010 

4 

32 

86 

4.E+01 

2.8-02 

1.8+00 

3.8-04 

-3.8-04 

l.OE+OO 

1010 

3 

33 

93 

4.8+01 

9.8-08 

4.8-08 

6.8+00 

-8.8-16 

2.18-04 

1010 

4 

34 

98 

4.8+01 

9.8-08 

4.8-08 

6.8+00 

-8.8-16 

9.98-06 

1010 

4 

35 

103 

4.8+01 

9.8-08 

4.8-08 

6.8+00 

-6.8-16 

9.98-06 

io10 

4 

36 

108 

4.8+01 

9.8-08 

4.8-08 

6.8+00 

-S.E-16 

9.9E-06 

1010 

4 

470 

3986 

4.8+01 

9.8-08 

4.8-06 

6.8+00 

-8.8-16 

2.2E-06 

IO10 

4 

471 

3995 

4.8+01 

9.8-08 

4.8-06 

6.8+00 

-8.8-16 

2.2E-06 

IO10 

4 

472 

4004 

4.8+01 

4.E+01 

9.8-08 

9.8-08 

4.8-06 

4 . 8-06 

6.8+00 

-8.8-16 

2.28-06 

IO10 

4 

Gauss-Newton on Problem 36a


$\epsilon = 10^{-11}$


k 

ev&ls. 

11**11, 

IIMI, 

\\9k\\2 

0 

2 

1 . E+00 

2. E+00 

3 . E+00 

1 

3 

9.E-01 

6.E-01 

6.E  01 

2 

5 

1 .  E+00 

4.E-01 

7.E-01 

3 

7 

2.  E+OO 

3.E-01 

8.E-01 

4 

9 

3. E+00 

2.E-01 

8.E-01 

5 

11 

4.  E+00 

2.E-01 

9.E-01 

6 

13 

S . E+00 

2.E-01 

1 . E+00 

7 

15 

6. E+00 

l.E-01 

1.E+00 

8 

17 

7. E+00 

l.E-01 

1 . E+00 

9 

19 

8. E+00 

l.E-01 

1 . E+00 

10 

22 

l.E+01 

l.B-01 

1 . E+OO 

11 

25 

l.B+01 

9 . E-02 

1 . E+00 

12 

28 

l.E+01 

8.E-02 

1 . E+00 

13 

31 

l.E+01 

7. E-02 

1 . E+00 

14 

34 

l.E+01 

7. E-02 

1 . E+OO 

15 

37 

2.E+01 

8. E-02 

1 .E+00 

16 

40 

2.E+01 

8 . E-02 

1.B+00 

17 

43 

2.E+01 

5. E-02 

1 . E+00 

18 

46 

2.E+01 

6. E-02 

1.B+0O 

19 

49 

2.E+01 

5. E-02 

1 . E+00 

20 

52 

2.E+01 

4. E-02 

1 . E+OO 

21 

55 

2.B+01 

4. E-02 

1 . B+OO 

22 

58 

2.E+01 

4.B-02 

1 .8+00 

23 

61 

3.E+01 

4. E-02 

1 . E+00 

24 

64 

3.E+01 

3. E-02 

1.E+00 

25 

67 

3.E+01 

3. E-02 

1 . E+OO 

26 

70 

3.E+01 

3. E-02 

1 .E+00 

27 

73 

3.E+01 

3. E-02 

1 .E+00 

28 

76 

3.E+01 

3. E-02 

1 . E+00 

29 

79 

3.E+01 

3.B-02 

1 . E+00 

30 

82 

4.B+01 

2 . E-02 

1.E+00 

31 

85 

4.E+01 

2.B-02 

1 . E+00 

32 

87 

4.B+01 

2. E-02 

1 .  E+00 

33 

89 

4.E+01 

2 .  E-02 

9.E-01 

34 

91 

4.E+01 

l.B-02 

8.B-01 

35 

92 

6.E+01 

1  .E-02 

6.E-01 

36 

93 

S.E+01 

4.E-03 

3.E-01 

37 

94 

S.E+01 

l.E-05 

7.E-04 

38 

95 

S.E+01 

S.E+01 

2.E-11 

6.E-17 

l.E-09 

4.E-1S 

IO"11 


IM 

9kP* 

Ok 

eond 

rank 

Jk 

Jk 

9.E-01 

-3. E+OO 

l.OE+OO 

10° 

4 

8.E-01 

-4.E-01 

1.0 E+00 

101 

4 

1 . E+00 

-l.B-01 

5.5E-01 

102 

4 

2. E+00 

-8. E-02 

4.SE-01 

103 

4 

2. E+00 

-6. E-02 

4.0E-01 

104 

4 

3. E+00 

-3. E-02 

3.7E-01 

104 

4 

3. E+00 

-2. E-02 

3.4E-01 

105 

4 

4. E+00 

-2. E-02 

3.3E-01 

10s 

4 

4 . E+00 

-1 .E-02 

3.2E-01 

10® 

4 

6. E+00 

-1 .E-02 

3.1E-01 

10® 

4 

8. E+00 

-l.B-02 

2.0E-01 

107 

4 

8 . E+00 

-8.E-03 

1.8E-01 

107 

4 

7. E+00 

-7.E-03 

1.7E-01 

107 

4 

7. E+00 

-8.E-03 

1.6E-01 

107 

4 

8. E+00 

-6 . E-03 

1.5E-01 

10® 

4 

8. E+00 

-4.E-03 

1.6E-01 

10® 

4 

8. E+00 

-3. E-03 

1.4E-01 

10® 

4 

9.B+00 

-3.B-03 

1.4E-01 

10® 

4 

9. E+00 

-3. E-03 

1.4E-01 

10® 

4 

9. E+OO 

-2. E-03 

1.3B-01 

10® 

4 

l.E+01 

-2. E-03 

1.3E-01 

10® 

4 

l.B+01 

-2.B-03 

1.38-01 

10® 

4 

1 .  B+01 

-1 . E-03 

1.3E-01 

10® 

4 

l.E+01 

-1 . E-03 

1.3E-01 

10® 

4 

l.E+01 

-1 .E-03 

1.3E-01 

10® 

4 

l.E+01 

-1 .E-03 

1.3E-01 

10® 

4 

l.E+01 

-9.E-04 

1.4E-01 

10® 

4 

l.B+01 

-7.B-04 

1.4E-01 

10® 

4 

l.E+01 

-8.E-04 

l.SE-01 

1010 

4 

l.B+01 

-8.E-04 

1.6E-01 

1010 

4 

9.  E+00 

-5.B-04 

1 .78-01 

1010 

4 

9.  B+OO 

-4.E-04 

1.9E-01 

1010 

4 

8.B+00 

-3.B-04 

3. 28-01 

1010 

4 

7.  B+OO 

-3.E-04 

3.8E-01 

1010 

4 

6.B+00 

-2.E-04 

6.3E-01 

io10 

4 

3. E+00 

-l.E-04 

1 . OE+OO 

io10 

4 

3.E-01 

-1.B-0S 

1 .  OE+OO 

10u 

4 

8.E-04 

-1.B-J0 

l.OE+OO 

10n 

4 

l.B-09 

-4.E-22 

l.OE+OO 

1011 

4 

The first difference between the two methods occurs at iteration 33. Data available from the SVD at the start of the iteration is shown in the following table.


ε ≤ 10^-10; iteration 33

r    sigma_r    |tau_r|    ||p_r||_2    |g^T p_r|    |cos(g, p_r)|

1    10^2       10^-4      10^-4        10^-4        10^0
2    10^2       10^-4      10^-4        10^-3        10^0
3    10^-2      10^-15     10^-4        10^-3        10^0
4    10^-8      10^1       10^1         10^-3        10^-5


The case for saying that rank(J) = 3 appears to be fairly strong. There is a large gap between $\sigma_4$ and $\sigma_3$, and $|\cos(g, p_4)|$ is significantly smaller than $|\cos(g, p_3)|$. Moreover, it would appear that the step taken when $\epsilon = 10^{-10}$ and rank(J) = 3 is better, in the sense that the reduction in the values of both $\|f\|_2$ and $\|g\|_2$ is appreciably greater than the reduction achieved when $\epsilon = 10^{-11}$ and rank(J) = 4. On the other hand, $\|p\|_2$ is not especially large for either choice of rank. For $\epsilon = 10^{-10}$, the algorithm subsequently makes unacceptably slow progress, while for $\epsilon = 10^{-11}$, quadratic convergence occurs after a few more iterations.

To see why no further progress can be made for $\epsilon = 10^{-10}$, consider the following table of information on the state of the method at the start of iteration 34.


ε ≤ 10^-10; iteration 34

r    sigma_r    |tau_r|    ||p_r||_2    |g^T p_r|    |cos(g, p_r)|

1    10^2       10^-9      10^-4        10^-4        10^0
2    10^2       10^-9      10^-4        10^-4        10^0
3    10^-2      10^-16     10^-4        10^-4        10^0
4    10^-8      10^1       10^1         10^-4        10^-5


The singular values are nearly the same as those of the previous iteration, but the change is enough to have rank(J) = 4 rather than rank(J) = 3 according to (5.3). The value of $\|f\|_2$ has decreased significantly after iteration 33: $|\tau_1|$ and $|\tau_2|$, which were the dominant components just prior to iteration 33, are much smaller at the start of iteration 34, although $|\tau_3|$ and $|\tau_4|$ are essentially unchanged. As a consequence, $\|p_4\|_2$ is now very large relative to $\|p_3\|_2$. $|\cos(g, p_4)|$ is small since $v_4$ is close to being orthogonal to $g$. In fact, if (5.3) is disregarded and rank(J) forced to be 3, the method will converge to a local minimum in one step.

As in the previous section, we compare the sequence $\{p_k^*\}$ of steps from the iterates to the minimum of the function, to the sequence $\{p_k\}$ of Gauss-Newton steps.


components {tau_i^*(x_k)} of p_k^* = x^* - x_k in terms of {v_i(x_k)}

 k   |tau_1^*|  |tau_2^*|  |tau_3^*|  |tau_4^*|

28    10^-3      10^-3      10^-15     10^1
29    10^-3      10^-3      10^-16     10^1
30    10^-3      10^-3      10^-14     10^1
31    10^-3      10^-3      10^-14     10^1
32    10^-3      10^-3      10^-15     10^1


components {tau_i(x_k)} of p_k in terms of {v_i(x_k)}

 k   |tau_1|    |tau_2|    |tau_3|    |tau_4|

28    10^-4      10^-4      10^-15     10^1
29    10^-4      10^-4      10^-15     10^1
30    10^-4      10^-4      10^-15     10^1
31    10^-4      10^-4      10^-14     10^1
32    10^-4      10^-4      10^-15     10^1


Taking rank(J) = 3 is a bad strategy in this case, because the solution lies mainly in the direction of $v_4(x_k)$.
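For readers who want to experiment with a problem of this kind, the sketch below writes the square root of a 2 x 2 matrix as a least-squares problem with n = m = 4, taking the residual to be the entries of X*X - A. This formulation is an assumption made for illustration (the precise definition and data of Problem 36a are given in the Appendix and are not reproduced here), and the function names are hypothetical.

```python
def sqrt_residual(x, a):
    """Residual vec(X*X - A) for 2x2 matrices stored row-wise as flat lists of 4."""
    x11, x12, x21, x22 = x
    product = [
        x11 * x11 + x12 * x21,
        x11 * x12 + x12 * x22,
        x21 * x11 + x22 * x21,
        x21 * x12 + x22 * x22,
    ]
    return [p - t for p, t in zip(product, a)]


def sqrt_jacobian(x):
    """4x4 Jacobian of sqrt_residual with respect to (x11, x12, x21, x22)."""
    x11, x12, x21, x22 = x
    return [
        [2.0 * x11, x21, x12, 0.0],
        [x12, x11 + x22, 0.0, x12],
        [x21, 0.0, x11 + x22, x21],
        [0.0, x21, x12, 2.0 * x22],
    ]


# X = [[1, 2], [3, 4]] is an exact square root of A = [[7, 10], [15, 22]].
print(sqrt_residual([1.0, 2.0, 3.0, 4.0], [7.0, 10.0, 15.0, 22.0]))

# When two eigenvalues of X sum to zero, the Jacobian is singular:
# here the middle two rows vanish.
print(sqrt_jacobian([1.0, 0.0, 0.0, -1.0]))
```

In this formulation the Jacobian loses rank whenever two eigenvalues of X sum to zero, which is one way such a small problem can produce very ill-conditioned Jacobians along the path of the iterates.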

5.4.  Watson  n  =  20;  m  =  31  (#  20d.) 

The final example for this section is a problem that might seem to be very hard for Gauss-Newton methods. In Watson's problem [Brent (1973); Moré, Garbow, and Hillstrom (1981)], a polynomial of degree n is fitted to approximate the solution of an ordinary differential equation. The Jacobian matrix for n = 20 has singular values of order $10^{2}$, $10^{1}$, $10^{1}$, $10^{0}$, $10^{0}$, $10^{0}$, $10^{-1}$, $10^{-1}$, $10^{-2}$, $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-4}$, $10^{-5}$, $10^{-6}$, $10^{-7}$, $10^{-8}$, $10^{-9}$, $10^{-11}$, and $10^{-12}$ at the origin. Yet there is very little difficulty in obtaining a solution, starting from $x_0 = 0$, for a wide range of values of $\epsilon$, as shown in the table below.

Gauss-Newton

ε              10^-8    10^-9    10^-10   10^-11; 10^-12   10^-13   ≤ 10^-14

f, J evals.      6        6        6            6             6         6
iters.           5        5        5            5             5         5
||x*||_2         ?       1.11     1.55         5.21          29.2      247.
||f*||_2       10^-8    10^-8    10^-9        10^-9         10^-10    10^-10
||g*||_2       10^-14   10^-14   10^-14       10^-12        10^-14    10^-12


Gauss-Newton compares favorably on this problem with results for the unconstrained methods, which are summarized in the next table.


                  MNA              DMNH             NPSOL            DMNG

f evals.      (352)   (251)      50    (149)      76     200      110     134
J evals.      (352)   (251)      27     (56)      76     200      108     119
iters.        (189)   (135)      26     (55)      38      99      107     119
||x*||_2        106     106    1.10     1.16    1.06    1.06     1.06    1.06
||f*||_2      10^-3   10^-3   10^-8    10^-8   10^-4   10^-6    10^-6   10^-7
||g*||_2      10^-5   10^-5   10^-13   10^-13  10^-5   10^-8    10^-11  10^-12
est. err.     10^-5   10^-5   10^-16   10^-18  10^-8   10^-11     ?     10^-13
conv.         TIME    TIME             LOOP


In MNA, the Hessian matrix is nearly singular (but not indefinite) at every iteration, with condition number ranging from $10^{11}$ to $10^{15}$, and it is modified at every step. The trust-region algorithm DMNH, which also uses exact second derivatives, loops for some values of the parameters in the termination criteria.

Watson's problem has a number of local minima, so that the value of the Gauss-Newton solution is dependent on $\epsilon$. Nothing can be said concerning which of the local minima is the "better" one without knowing how the solution is going to be used. For the larger values of $\epsilon$, solutions are obtained that are small in magnitude and hence closer to the starting value, because lower values of the rank restrict the size of the search directions. On the other hand, the final value of the sum of squares is smaller for smaller values of $\epsilon$, because the objective function is being decreased in a larger subspace at each step. Details of the Gauss-Newton iterations are given below.

Gauss-Newton on Problem 20d


k 


0 

1 

2 

3 

4 


0 

1 

2 

3 

4 


0 

1 

2 

3 

4 


0 

1 

2 

3 

4 


0 

1 

2 

3 

4 


0 

1 

2 

3 

4 


evais. 


2 

3 

4 

5 

6 


2 

3 

4 

5 

6 


2 

3 

4 

5 

6 


2 

3 

4 

5 

6 


2 

3 

4 

5 

6 


2 

3 

4 

5 

6 


ll**ll2 

IIAIIj 

M, 

f  : 

=  10"8 

0.E+00 

S . E+00 

2.E+02 

1.E+00 

1.E+00 

3.  E+00 

1  .£+02 

4.E-01 

1.E+00 

4.E-01 

2.E+01 

6.E-02 

1.E+00 

2.E-03 

l.E-01 

6.E-02 

1.E+00 

3 .  E-08 

6 .  E-07 

3.B-06 

1 . E+00 

3.E-08 

2.E-14 

€  = 

=  10"9 

0.E+00 

6. E+00 

2.E+02 

1 .E+00 

1 . E+00 

3 .  E+00 

1  .£+02 

4.E-01 

1 , E+00 

4.E-01 

2 .  £+01 

l.E-01 

1 . E+00 

2.E-03 

l.E-01 

2.E-01 

1 . E+00 

2.  E-08 

6.  E-07 

l.E-04 

1 .E+00 

1 .  E-08 

7.E-16 

€  = 

:  10-10 

0 . E+00 

8. E+00 

2.E+02 

1 .  E+OO 

1 . E+00 

3.  E+00 

l.E+02 

4.E-01 

1 .  E+00 

4.E-01 

2.E+01 

6.E-01 

1  .E+00 

2.E-03 

l.E-01 

7.E-01 

2. E+00 

1 .  E-08 

8.  E-07 

8 . E-04 

2.  E+00 

4.E-09 

2.E-14 

£=10 

-n;  10“ 

0 .  E+00 

5 . E+00 

2.E+02 

1 .  E+OO 

1 .  E+00 

3 . E+00 

1 . E+02 

4.E-01 

1  .E+00 

4.E-01 

2.E+01 

2.  E+00 

2.  E+00 

2.E-03 

l.E-01 

3.  E+00 

6.  E+00 

1 .E-08 

6. E-07 

4.E-03 

6 .  E+00 

l.E-09 

6.E-13 

£  = 

10-lS 

0.E+00 

5 . E+00 

2. E+02 

1 .E+00 

1  .E+00 

3. E+00 

l.E+02 

4.E-01 

1 .  E+00 

4.E-01 

2.E+01 

l.B+01 

1 . E+00 

2.E-03 

l.E-01 

2.E+01 

3.B+00 

1 . E-08 

8.B-07 

3.E-02 

3 . E+00 

8.E-10 

3.E-14 

£  = 

10~14 

0 . E+00 

8. E+00 

2. E+02 

1  .E+00 

1 . E+00 

3. E+00 

l.E+02 

4.E-01 

1 . E+00 

4.E-01 

2.E+01 

8.E+01 

8.E+01 

2.E-03 

l.E-01 

2. E+02 

2.E+02 

1 .  E-08 

6.  E-07 

3.E-01 

2.E+02 

2.E-10 

4.E-12 

-T 

9kPk 

<*k 

cond 

rank 

Jk 

Jk 

-3.E+01 

l.OE+OO 

1014 

15 

-6. E+00 

1 . OE+OO 

1013 

15 

-2.E-01 

l.OE+OO 

1013 

15 

-4.E-06 

l.OE+OO 

1013 

15 

-2.E-16 

1 . OE+OO 

1013 

15 

-3.E+01 

1 . OE+OO 

1014 

16 

-6 . E+OO 

1 . OE+OO 

1013 

16 

-2.E-01 

l.OE+OO 

1013 

16 

-4.E-06 

l.OE+OO 

1013 

16 

-2.E-16 

1 . OE+OO 

1013 

16 

-3.E+01 

1 . OE+OO 

1014 

17 

-6. E+OO 

1 . OE+OO 

1013 

17 

-2.E-01 

l.OE+OO 

1013 

17 

-4.B-06 

l.OE+OO 

1013 

17 

-2.E-16 

l.OE+OO 

1013 

17 

-3.E+01 

1 . OE+OO 

1014 

18 

-6. E+OO 

1 . OE+OO 

1013 

18 

-2.E-01 

l.OE+OO 

1013 

18 

-4. E-08 

1 . OE+OO 

1013 

18 

-2.E-18 

1 . OE+OO 

1013 

18 

-3.E+01 

l.OE+OO 

1014 

19 

-8. E+OO 

1 . OE+OO 

1013 

19 

-2.B-01 

1 . OE+OO 

1013 

19 

-4 . E-08 

1 . OE+OO 

1013 

19 

-2.B-16 

< . OE+OO 

1013 

19 

-3.E+01 

1 . OE+OO 

1014 

20 

-8 . E+OO 

l.OE+OO 

1013 

20 

-2.E-01 

1 . OE+OO 

1013 

20 

-4.E-06 

l.OE+OO 

1013 

20 

-2.E-16 

1 . OE+OO 

1013 

20 

The condition number of the Jacobian remains very large throughout, yet the search direction is never especially large regardless of the choice of rank, because the sequence $\{|u_i^T f|\}$ is monotonically decreasing at about the same rate as the singular values (see (5.2)). The unit step gives sufficient decrease in every instance, on account of the many local minima. Moreover, there is superlinear convergence for each value of $\epsilon$, despite the fact that $p$ becomes very close to being orthogonal to the gradient, with $|\cos(g, p)|$ ranging from $10^{-5}$ for $\epsilon = 10^{-8}$, to $10^{-9}$ for $\epsilon \le 10^{-14}$ in the final step.
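For reference, the residuals of Watson's problem can be sketched as follows, using the definition commonly given for this test function in the collection of Moré, Garbow, and Hillstrom (m = 31, with $t_i = i/29$). The function name `watson_residuals` is hypothetical, and the code is an illustration rather than the implementation used for the experiments reported here.

```python
import math


def watson_residuals(x):
    """Residuals of the Watson function (m = 31) for a coefficient vector x of length n."""
    n = len(x)
    f = []
    for i in range(1, 30):
        t = i / 29.0
        # Derivative of the fitted polynomial: sum_{j=2..n} (j-1) x_j t^(j-2)
        deriv = sum((j - 1) * x[j - 1] * t ** (j - 2) for j in range(2, n + 1))
        # Value of the fitted polynomial: sum_{j=1..n} x_j t^(j-1)
        value = sum(x[j - 1] * t ** (j - 1) for j in range(1, n + 1))
        f.append(deriv - value * value - 1.0)
    f.append(x[0])                       # residual 30
    f.append(x[1] - x[0] ** 2 - 1.0)     # residual 31
    return f


f0 = watson_residuals([0.0] * 20)
norm_f0 = math.sqrt(sum(v * v for v in f0))
print(len(f0), norm_f0)
```

At the origin thirty of the residuals equal -1 and one is zero, so the initial residual norm is the square root of 30, in agreement with the starting value of order 5 shown for iteration 0 in the trace above.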

6.  An  Example  of  Poor  Performance 

on  a  Well-Conditioned  Zero-Residual  Problem 

On problems with well-conditioned Jacobians, Gauss-Newton methods are globally convergent, and they are locally quadratically convergent if in addition the residuals vanish at the solution (see Section 2). It is generally believed that Gauss-Newton methods will work well on zero- or small-residual problems in which the Jacobian is never ill-conditioned. In this section, we exhibit a zero-residual problem on which Gauss-Newton performs poorly, although cond($J_k$) never exceeds $5 \times 10^{3}$. The example used is the following modification of Rosenbrock's Function [Moré, Garbow, and Hillstrom (1981), p. 21].

Modified Rosenbrock Function n = m = 2

$$\phi_1(x) = 100\,(x_2 - x_1^2)$$

$$\phi_2(x) = 1 - x_1$$

$$x_0 = (0, 0)$$

(i£>)  -  (J)  •,(u) 

The  starting  point  (0, 0)  lies  at  the  bottom  of  a  curved  steep-sided  valley  in  which  the  solution 
(1, 1)  also  lies.  The  following  table  gives  the  results  for  Gauss-Newton  and  Newton's  method  on 
this  problem. 

Modified Rosenbrock n = m = 2; x_0 = (0, 0)

                 Gauss-Newton    Newton's Method

f, J evals.          467               77
iters.               100               50
||x*||_2            1.41             1.41
||f*||_2           10^-15           10^-13
||g*||_2           10^-13           10^-12
est. err.          10^-30           10^-26


The  same  linesearch  is  used  here  for  both  methods  (see  Section  4).  Newton’s  method  can  be 
applied  without  modification,  since  the  Hessian,  as  well  as  the  Jacobian,  is  well-conditioned.  In 
this  case,  Gauss-Newton  is  Newton's  method  for  nonlinear  equations,  because  n  =  m.  Contour 
plots  of  the  progress  of  the  two  methods  are  displayed  on  the  pages  following  Section  7  of  this 
paper. 
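The difficulty can already be seen at the starting point with a few lines of arithmetic. The sketch below assumes the residuals $\phi_1(x) = 100(x_2 - x_1^2)$ and $\phi_2(x) = 1 - x_1$, computes the full Gauss-Newton step from $x_0 = (0, 0)$, and evaluates the sum of squares along it. The function names are hypothetical and the code is an illustration only.

```python
def residuals(x):
    """Assumed residuals of the modified Rosenbrock function."""
    x1, x2 = x
    return (100.0 * (x2 - x1 * x1), 1.0 - x1)


def jacobian(x):
    """Jacobian of the assumed residuals, as two rows."""
    x1, _ = x
    return ((-200.0 * x1, 100.0), (-1.0, 0.0))


def gauss_newton_step(x):
    """Solve J p = -f for the square 2x2 case by Cramer's rule."""
    (a, b), (c, d) = jacobian(x)
    f1, f2 = residuals(x)
    det = a * d - b * c
    return ((-f1 * d + b * f2) / det, (-a * f2 + c * f1) / det)


def merit(x, p, alpha):
    """Half the sum of squares at x + alpha * p."""
    f1, f2 = residuals((x[0] + alpha * p[0], x[1] + alpha * p[1]))
    return 0.5 * (f1 * f1 + f2 * f2)


x0 = (0.0, 0.0)
p0 = gauss_newton_step(x0)
for alpha in (0.0, 0.01, 0.1, 1.0):
    print(alpha, merit(x0, p0, alpha))
```

Under these assumptions the full step leads from (0, 0) to (1, 0), where the sum of squares is four orders of magnitude larger than at the starting point, while only a very short step along the same direction gives any decrease.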

The minimum of the Gauss-Newton model (2.1) lies well outside the valley in which the starting value and minimum are located, at least until the iterates are very close to the solution. The univariate function $\Phi(\alpha) = \tfrac{1}{2}\|f(x_k + \alpha p_k)\|_2^2$ actually has a maximum at $\alpha = 1$ for $\alpha \in [0, 1]$, rather than a minimum as predicted by the quadratic model; moreover, the function rises very steeply from the valley floor to the maximum. Hence a significant number of function evaluations are required in the linesearch in order to minimize $\Phi(\alpha)$, and, initially, rather small steps are taken along the search directions. Strategies for improving the efficiency of the method include decreasing the maximum steplength $\alpha^{\max}$ and relaxing the parameter $\eta$ that controls the accuracy of the univariate minimization in the linesearch (see, for example, Gill, Murray, and Wright [1981]). For example, if $N_k$ is the number of function evaluations required to determine $\alpha_k$, and the following scheme is used to define $\alpha_k^{\max}$


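$$\alpha_k^{\max} = \gamma_k\,(1 + \|x_k\|_2), \qquad \gamma_0 = 1.0,$$

$$\gamma_k = \begin{cases} 2\gamma_{k-1} & \text{if } \alpha_{k-1} = \alpha_{k-1}^{\max}, \\ \gamma_{k-1} & \text{if } \alpha_{k-1} < \alpha_{k-1}^{\max} \text{ and } N_{k-1} \le 2, \\ \gamma_{k-1}/2 & \text{if } \alpha_{k-1} < \alpha_{k-1}^{\max} \text{ and } N_{k-1} > 2, \end{cases}$$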
then the Gauss-Newton method solves the problem in only 63 iterations and 135 function
evaluations with $\eta = 0.5$. By contrast, the relatively efficient performance of Newton's method can
be explained by the fact that the minimum of the Newton quadratic model falls very near the
curve along the valley floor connecting (0,0) to (1,1) (which is followed by the iterates of both
methods), at all iterations except the first one.
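
A scheme of this kind, which adapts the bound on the steplength from one iteration to the next, might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results above: it assumes that the factor is doubled when the previous step reached the previous bound, that a count of exactly two function evaluations is grouped with the "few evaluations" case, and the function names are hypothetical.

```python
def next_gamma(gamma_prev, alpha_prev, alpha_max_prev, n_evals_prev):
    # Assumption: the bound was "active" if the previous steplength hit it.
    if alpha_prev == alpha_max_prev:
        return 2.0 * gamma_prev
    # Assumption: two or fewer evaluations means the linesearch was cheap,
    # so the factor is left unchanged.
    if n_evals_prev <= 2:
        return gamma_prev
    # Otherwise the linesearch was expensive: shrink the bound.
    return gamma_prev / 2.0


def steplength_bound(gamma, x_norm):
    # alpha_max = gamma * (1 + ||x||_2)
    return gamma * (1.0 + x_norm)
```

Starting from a factor of 1.0, the bound grows while full-length steps are being accepted and shrinks whenever the linesearch has to work hard to find an acceptable point.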

7.  Conclusions 

We have examined the performance of some Gauss-Newton methods on specific examples
and given precise explanations of the observed results in every case. From some of these examples,
we conclude that ill-conditioning in the Jacobian does not necessarily imply that a Gauss-Newton
method will not work well, but that there appears to be no strategy that is uniformly best
for estimating rank in the linear least-squares subproblems. We give another example showing
that Gauss-Newton methods may not necessarily be effective on well-conditioned zero-residual
problems. Most importantly, we have demonstrated that it is necessary to look at details of
the performance of Gauss-Newton methods in order to make meaningful statements about their
behavior.

[Figure: Gauss-Newton Method on the Modified Rosenbrock Function. Starting value: $x_0 = (0,0)$; solution: $x^* = (1,1)$. Plot symbols: G marks $x_k + p_k$; ♦ marks $x_k + \alpha_k p_k$.]

[Figure: Newton's Method on the Modified Rosenbrock Function. Starting value: $x_0 = (0,0)$; solution: $x^* = (1,1)$. Plot symbols: N marks $x_k + p_k$; + marks $x_k + \alpha_k p_k$.]

8. Appendix: Numerical Results for some Gauss-Newton Methods


8.1.  Software  and  Algorithm 

In this section numerical results are given for the test problems described in the previous
subsection. The software package LSSOL [Gill et al. (1986a)] is used to solve the linear
least-squares subproblem (2.3). The linesearch procedure used for the numerical examples in this
section, and also in Sections 5 and 6, requires both function and gradient information. It is taken
from the nonlinear programming code NPSOL [Gill et al. (1979); (1986b)]. Numerical results for
the same set of test problems using widely-distributed software for unconstrained optimization
and nonlinear least squares can be found in Fraley [1987].


8.2.  Parameters 

Parameters  in  LSSOL  were  kept  at  their  default  values  with  the  following  exceptions  : 

Rank Tolerance - varied, see tables
Infinite Bound Size - $10^{20}$

See  Gill  et  al.  [1986a]  for  details  concerning  the  parameters. 

In  addition,  the  following  parameters  are  chosen  for  the  linesearch  : 

$\eta$ - 0.5

$\alpha_{\max}$ - $\min\{(100(1 + \|x\|_2) + 1)/\|p\|_2,\ 10^{20}\}$ †

† In some cases the default value of $\alpha_{\max}$ was too large and overflow occurred during function evaluation
in the linesearch. These cases are indicated in the tables by giving, in the column labeled "step fac",
the value $\gamma < 100$ such that $\alpha_{\max} = \min\{(\gamma(1 + \|x\|_2) + 1)/\|p\|_2,\ 10^{20}\}$ was subsequently used to
obtain the results.

See, for example, Gill, Murray, and Wright [1981] for a discussion of the linesearch parameters.
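
The bound on the steplength described above can be computed as in the following sketch. The function name, the argument names, and the treatment of a zero search direction are assumptions made for illustration; only the formula itself comes from the text.

```python
import math


def default_alpha_max(x, p, gamma=100.0, bound=1e20):
    # Euclidean norms of the current point and the search direction.
    x_norm = math.sqrt(sum(v * v for v in x))
    p_norm = math.sqrt(sum(v * v for v in p))
    if p_norm == 0.0:
        # Assumption: with a zero direction only the fixed bound applies.
        return bound
    # alpha_max = min{ (gamma (1 + ||x||_2) + 1) / ||p||_2 , 10^20 }
    return min((gamma * (1.0 + x_norm) + 1.0) / p_norm, bound)
```

Passing a smaller `gamma` corresponds to the cases in which the default bound led to overflow and a reduced factor had to be used.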


8.3.  Convergence  Criteria 

Convergence is judged to have occurred at the kth iterate if either

$$\|f_k\|_2 < \epsilon^{0.9} \tag{8.1}$$

or

$$\|g_k\|_2 < \epsilon^{\,\ldots/3}\,(1 + \|f_k\|_2). \tag{8.2}$$

The algorithm is also terminated if there is a negligible change in x,

$$\alpha_k \|p_k\|_2 < \epsilon^{0.9}\,(1 + \|x_k\|_2), \tag{8.3}$$

where $\alpha_k$ is the step length determined by the linesearch.
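
The three tests (8.1) to (8.3) each compare a norm with a tolerance, the last two scaled relative to the size of f and x respectively. The following sketch shows one way the tests could be combined. The tolerances are deliberately left as arguments because their values are not assumed here, the returned labels follow the notation of Section 8.4, and the function name is hypothetical.

```python
def termination_status(f_norm, g_norm, step_norm, x_norm, tol_f, tol_g, tol_x):
    # step_norm is alpha_k * ||p_k||_2, the length of the step actually taken.
    flags = []
    if f_norm < tol_f:                       # test (8.1)
        flags.append("ABS F")
    if g_norm < tol_g * (1.0 + f_norm):      # test (8.2)
        flags.append("g")
    if step_norm < tol_x * (1.0 + x_norm):   # test (8.3)
        flags.append("x")
    return flags
```

More than one test may be satisfied at the same iterate, which is why a list of labels is returned rather than a single one.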


8.4.  Table  Information 


Under  the  label  ‘conv.’,  the  following  notation  is  used  to  describe  conditions  under  which 
the  algorithm  terminates  : 

ABS  F  -  (8.1) 

g  -  (8.2) 

x  -  (8.3) 

F lim - function evaluation limit reached


Under the label 'est. err.', we include the quantity

$$\frac{\|f^*\|_2 - \|f_{best}\|_2}{1 + \|f_{best}\|_2} \tag{8.4}$$

where $f^*$ is the value of $f$ at the point of termination, and $\|f_{best}\|_2$ is the best available estimate
of the norm of the solution, in order to get some idea of the error in $\|f^*\|_2$. For those problems
that have nonzero residuals, the value of $\|f_{best}\|_2$ is given to six figures of accuracy, rounded
down.
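
As a small worked illustration of (8.4), the quantity can be evaluated as below; the function and argument names are hypothetical. For a zero-residual problem the best available estimate is zero, so the estimated error reduces to the final residual norm itself.

```python
def estimated_error(f_norm_final, f_norm_best):
    # (||f*||_2 - ||f_best||_2) / (1 + ||f_best||_2), as in (8.4)
    return (f_norm_final - f_norm_best) / (1.0 + f_norm_best)
```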


A superscript 0 following a problem number indicates a zero-residual problem.

A superscript L following a problem number denotes a linear least-squares problem.

For further details on the numerical tests, see Section 4. Information on the test problems is
given in the next section.

                                          26    60.3   10^-13   10^-11   ?        g, x
                2.23x10^-16   ?     33    26    60.3   10^-13   10^-11   ?        g, x
42d.⁰   4   24  1.49x10^-8    0.1   27    23    ?      10^-14   10^-11   10^-38   g, x
                2.23x10^-16   0.1   27    23    ?      10^-14   10^-11   10^-38   g, x
43a.⁰   5   16  1.49x10^-8    1.0   22    14    54.0   10^-14   10^-11   10^-37   g
                2.23x10^-16   1.0   22    14    54.0   10^-14   10^-11   10^-37   g
43b.⁰   5   16  1.49x10^-8    1.0   ?     392   62.1   10^-1    10^-11   10^-3    g
                2.23x10^-16   1.0   ?     392   62.1   10^-1    10^-11   10^-3    g
43c.⁰   5   16  1.49x10^-8    1.0   23    14    54.0   10^-14   10^-13   10^-38   g
                2.23x10^-16   1.0   23    14    54.0   10^-14   10^-13   10^-38   g
43d.⁰   5   16  1.49x10^-8    1.0   ?     ?     54.0   10^-14   10^-13   10^-37   g
                2.23x10^-16   1.0   ?     ?     54.0   10^-14   10^-13   10^-37   g
43e.⁰   5   16  1.49x10^-8    1.0   ?     19    54.0   10^-14   10^-11   10^-37   g
                2.23x10^-16   1.0   ?     19    54.0   10^-14   10^-11   10^-37   g
43f.⁰   5   16  1.49x10^-8    ?     20    11    54.0   10^-14   10^-13   10^-37   g
                2.23x10^-16   ?     20    11    54.0   10^-14   10^-13   10^-37   g
44a.⁰   6   6   1.49x10^-8          ?     ?     ?      10^-14   10^-13   10^-37   g
                2.23x10^-16         ?     ?     ?      10^-14   10^-13   10^-37   g
44b.⁰   6   6   1.49x10^-8          5     4     3.52   10^-15   10^-13   10^-39   ABS F, g
                2.23x10^-16         5     4     3.52   10^-15   10^-13   10^-36   ABS F, g
44c.⁰   6   6   1.49x10^-8          ?     ?     20.6   10^-14   10^-11   ?        ABS F
                2.23x10^-16         ?     ?     20.6   10^-14   10^-11   ?        ABS F
44d.⁰   6   6   1.49x10^-8          36    15    15.3   10^-14   10^-11   ?        ABS F, g
                2.23x10^-16         36    15    15.3   10^-14   10^-11   10^-39   ABS F, g
44e.⁰   6   6   1.49x10^-8          70    ?     ?      10^-15   10^-13   10^-39   ABS F, g
                2.23x10^-16         70    ?     ?      10^-15   10^-13   10^-39   ABS F, g
45a.⁰   8   8   1.49x10^-8          125   29    4.06   10^-14   10^-13   10^-37   g
                2.23x10^-16         125   29    4.06   10^-14   10^-13   10^-37   g
45b.⁰   8   8   1.49x10^-8          5     4     3.56   10^-15   10^-13   10^-39   ABS F, g
                2.23x10^-16         5     4     3.56   10^-15   10^-13   10^-39   ABS F, g
45c.⁰   8   8   1.49x10^-8          ?     ?     20.6   10^-14   10^-11   10^-39   ABS F
                2.23x10^-16         ?     ?     20.6   10^-14   10^-11   10^-39   ABS F
45d.⁰   8   8   1.49x10^-8          ?     15    15.3   10^-15   10^-11   10^-39   ABS F, g
                2.23x10^-16         ?     15    15.3   10^-15   10^-11   10^-39   ABS F, g
45e.⁰   8   8   1.49x10^-8          70    23    9.31   10^-14   10^-11   ?        g
                2.23x10^-16         70    23    9.31   10^-14   10^-11   10^-38   g


8.5.  Test  Problems 


Superscripts  on  problem  numbers  have  the  following  interpretation  : 

0  :  zero-residual  problem 
L  :  linear  least-squares  problem 


Problems from Moré, Garbow, and Hillstrom [1981]

        n    m
1.⁰     2    2    Rosenbrock
2.⁰     2    2    Freudenstein and Roth
3.⁰     2    2    Powell Badly Scaled
4.⁰     2    3    Brown Badly Scaled
5.⁰     2    3    Beale
6.      2    10   Jennrich and Sampson
7.⁰     3    3    Helical Valley
8.      3    15   Bard
9.      3    15   Gaussian
10.     3    16   Meyer
11.⁰    3    10   Gulf Research and Development†
12.⁰    3    10   Box 3-Dimensional
13.⁰    4    4    Powell Singular
14.⁰    4    6    Wood
15.     4    11   Kowalik and Osborne
16.     4    20   Brown and Dennis
17.     5    33   Osborne 1
18.⁰    6    13   Biggs EXP6‡

† For the Gulf Research and Development Function (# 11), the formula

$$f_i(x) = \exp\left[-\frac{|y_i\, mi\, x_2|^{x_3}}{x_1}\right] - t_i$$

given in Moré, Garbow, and Hillstrom [1981] for the residual functions is in error. The correct formula is

$$f_i(x) = \exp\left[-\frac{|y_i - x_2|^{x_3}}{x_1}\right] - t_i$$

(see Moré, Garbow, and Hillstrom [1978]).

‡ For the Biggs EXP6 Function (# 18), the minimum value for the sum of squares is given in
Moré, Garbow, and Hillstrom [1981] as $5.65565\ldots \times 10^{-3}$. It can be easily verified that the residuals
vanish at several points (for example (1, 10, 1, 5, 4, 3)).
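
The claim about the Biggs EXP6 function can be checked directly. The sketch below assumes the standard definition of the residuals from Moré, Garbow, and Hillstrom [1981], with $t_i = 0.1\,i$ and $y_i = e^{-t_i} - 5e^{-10 t_i} + 3e^{-4 t_i}$; the function name is hypothetical.

```python
import math


def biggs_exp6_residuals(x, m=13):
    # Residuals x3 exp(-t x1) - x4 exp(-t x2) + x6 exp(-t x5) - y, i = 1..m.
    res = []
    for i in range(1, m + 1):
        t = 0.1 * i
        y = math.exp(-t) - 5.0 * math.exp(-10.0 * t) + 3.0 * math.exp(-4.0 * t)
        model = (x[2] * math.exp(-t * x[0])
                 - x[3] * math.exp(-t * x[1])
                 + x[5] * math.exp(-t * x[4]))
        res.append(model - y)
    return res
```

At (1, 10, 1, 5, 4, 3) each term of the model matches the corresponding term of $y_i$, so every residual is zero up to rounding. The same holds at (4, 10, 3, 5, 1, 1), where the first and third exponential terms change places.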

Problems from Moré, Garbow, and Hillstrom [1981] (continued)

        n    m
19.     11   65   Osborne 2†
20a.    6    31   Watson
20b.    9    31   Watson
20c.    12   31   Watson
20d.    20   31   Watson
21a.⁰   10   10   Extended Rosenbrock
21b.⁰   20   20   Extended Rosenbrock
22a.⁰   12   12   Extended Powell Singular
22b.⁰   20   20   Extended Powell Singular
23a.    4    5    Penalty I
23b.    10   11   Penalty I
24a.    4    8    Penalty II
24b.    10   20   Penalty II
25a.⁰   10   12   Variably Dimensioned
25b.⁰   20   22   Variably Dimensioned
26a.⁰   10   10   Trigonometric
26b.⁰   20   20   Trigonometric
27a.⁰   10   10   Brown Almost Linear
27b.⁰   20   20   Brown Almost Linear
28a.⁰   10   10   Discrete Boundary Value
28b.⁰   20   20   Discrete Boundary Value
29a.⁰   10   10   Discrete Integral
29b.⁰   20   20   Discrete Integral
30a.⁰   10   10   Broyden Tridiagonal
30b.⁰   20   20   Broyden Tridiagonal
31a.⁰   10   10   Broyden Banded
31b.⁰   20   20   Broyden Banded
32.ᴸ    10   20   Linear - Full Rank
33.ᴸ    10   20   Linear - Rank 1
34.ᴸ    10   20   Linear - Rank 1 with Zero Columns and Rows
35a.    8    8    Chebyquad
35b.⁰   9    9    Chebyquad
35c.    10   10   Chebyquad

† For Osborne's Second Function (# 19), the value of $f(x^*)$ is given (to six figures) in Moré,
Garbow, and Hillstrom [1981] as $4.01377 \times 10^{-2}$. The smallest value we were able to obtain was
$4.01683 \times 10^{-2}$.

Matrix Square Root Problems

        n    m
36a.⁰   4    4    Matrix Square Root 1
36b.⁰   9    9    Matrix Square Root 2
36c.⁰   9    9    Matrix Square Root 3
36d.⁰   9    9    Matrix Square Root 4

These test problems come from a private communication of S. Hammarling to P. E. Gill in
1983.

•  The identity matrix was used as the starting value in all instances. Note that the iteration
should not be started with the zero matrix because it is a stationary point of the sum of
squares.
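
The remark about the zero matrix can be illustrated with a short computation. The sketch assumes that the residuals are the entries of $X^2 - A$, which is not stated explicitly above; under that assumption the gradient of $\tfrac{1}{2}\|X^2 - A\|_F^2$ with respect to $X$ is $X^T R + R X^T$ with $R = X^2 - A$, and it vanishes at $X = 0$ for every $A$. All names are hypothetical.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def sum_of_squares_gradient(x, a):
    # Residual matrix R = X X - A (assumed formulation of the problem).
    xx = matmul(x, x)
    n = len(a)
    r = [[xx[i][j] - a[i][j] for j in range(n)] for i in range(n)]
    xt = transpose(x)
    left, right = matmul(xt, r), matmul(r, xt)
    # Gradient of 0.5 * ||R||_F^2 with respect to X is X^T R + R X^T.
    return [[left[i][j] + right[i][j] for j in range(n)] for i in range(n)]
```

Because every term of the gradient contains a factor of $X$, an iteration started at the zero matrix has no descent direction to follow, whereas the identity matrix gives the nonzero gradient $2(I - A)$ whenever $A \ne I$.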


Problems  from  Salane  [1987] 
n  m 

37.  2  16  Hanson  1 

38.  3  16  Hanson  2 


Problems from McKeown [1975a] (also McKeown [1975b])

        n    m
39a.    2    3    McKeown 1    0.001
39b.    2    3    McKeown 1    0.01
39c.    2    3    McKeown 1    0.1
39d.    2    3    McKeown 1    1.0
39e.    2    3    McKeown 1    10.0
39f.    2    3    McKeown 1    100.0
39g.    2    3    McKeown 1    1000.0
40a.†   3    4    McKeown 2    0.001
40b.†   3    4    McKeown 2    0.01
40c.†   3    4    McKeown 2    0.1
40d.†   3    4    McKeown 2    1.0
40e.†   3    4    McKeown 2    10.0
40f.†   3    4    McKeown 2    100.0
40g.†   3    4    McKeown 2    1000.0
41a.    5    10   McKeown 3    0.001
41b.    5    10   McKeown 3    0.01
41c.    5    10   McKeown 3    0.1
41d.    5    10   McKeown 3    1.0
41e.    5    10   McKeown 3    10.0
41f.    5    10   McKeown 3    100.0
41g.    5    10   McKeown 3    1000.0

† In the data defining this problem given in McKeown [1975a] and [1975b], the matrix

$$B = \begin{pmatrix} 2.95137 & 4.87407 & -2.0506 \\ 4.87407 & 9.39321 & -3.93181 \\ -2.0506 & -3.93189 & 2.64745 \end{pmatrix}$$

is in error (it should be symmetric). The value

$$B = \begin{pmatrix} 2.95137 & 4.87407 & -2.0506 \\ 4.87407 & 9.39321 & -3.93189 \\ -2.0506 & -3.93189 & 2.64745 \end{pmatrix},$$

which is correct to six decimal digits, was used in our formulation of the problem.
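
The discrepancy is easy to locate mechanically. The following sketch, with hypothetical names, lists the index pairs at which a matrix fails to be symmetric and applies the check to the two versions of B quoted above.

```python
B_PRINTED = [[2.95137, 4.87407, -2.0506],
             [4.87407, 9.39321, -3.93181],
             [-2.0506, -3.93189, 2.64745]]

B_USED = [[2.95137, 4.87407, -2.0506],
          [4.87407, 9.39321, -3.93189],
          [-2.0506, -3.93189, 2.64745]]


def asymmetric_entries(b):
    # Zero-based (row, column) pairs above the diagonal where b differs
    # from its transpose.
    n = len(b)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if b[i][j] != b[j][i]]
```

Only the entry in the second row and third column of the printed matrix disagrees with its mirror image; the matrix used in the formulation passes the check.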

Problems from DeVilliers and Glasser [1981] (also Salane [1987])

        n    m                                starting value
42a.⁰   4    24   DeVilliers and Glasser 1    (1.0, 8.0, 4.0, 4.412)
42b.⁰   4    24   DeVilliers and Glasser 1    (1.0, 8.0, 8.0, 1.0)
42c.⁰   4    24   DeVilliers and Glasser 1    (1.0, 8.0, 1.0, 4.412)
42d.⁰   4    24   DeVilliers and Glasser 1    (1.0, 8.0, 4.0, 1.0)
43a.⁰   5    16   DeVilliers and Glasser 2    (45.0, 2.0, 2.5, 1.5, 0.9)
43b.⁰   5    16   DeVilliers and Glasser 2    (42.0, 0.8, 1.4, 1.8, 1.0)
43c.⁰   5    16   DeVilliers and Glasser 2    (45.0, 2.0, 2.1, 2.0, 0.9)
43d.⁰   5    16   DeVilliers and Glasser 2    (45.0, 2.5, 1.7, 1.0, 1.0)
43e.⁰   5    16   DeVilliers and Glasser 2    (35.0, 2.5, 1.7, 1.0, 1.0)
43f.⁰   5    16   DeVilliers and Glasser 2    (42.0, 0.8, 1.8, 3.15, 1.0)

Problems from Dennis, Gay, and Vu [1985]

         n    m                   starting value
44a.⁰†   6    6    Exp. 791129    (.299, -0.273, -.474, .474, -.0892, .0892)‡
44b.⁰†   6    6    Exp. 791226    (-.3, .3, -1.2, 2.69, 1.59, -1.5)
44c.⁰†   6    6    Exp. 0121a     (-.041, .03, -2.565, 2.565, -.754, .754)‡
44d.⁰†   6    6    Exp. 0121b     (-.056, .026, -2.991, 2.991, -.568, .568)
44e.⁰†   6    6    Exp. 0121c     (-.074, .013, -3.632, 3.632, -.289, .289)
45a.⁰    8    8    Exp. 791129    (.299, .186, -0.273, .0254, -0.474, -.0892, .0892)‡
45b.⁰    8    8    Exp. 791226    (-.3, -.39, .3, -.344, -1.2, 2.69, 1.59, -1.5)
45c.⁰    8    8    Exp. 0121a     (-.041, -.775, .03, -.047, -2.565, 2.565, -.754, .754)‡
45d.⁰    8    8    Exp. 0121b     (-.056, -.753, .026, -.047, -2.991, 2.991, -.568, .568)
45e.⁰    8    8    Exp. 0121c     (-.074, -.733, .013, -.034, -3.632, 3.632, -.289, .289)

† Variables $x_2$ and $x_4$ ($b$ and $d$ in Dennis, Gay, and Vu [1985]) are eliminated from the linear
constraints in order to get the 6-variable formulation of the problem (see Dennis, Gay, and Vu
[1985]).

‡ Specification of some starting values in Dennis, Gay, and Vu [1985] is incomplete. The correct
values were obtained from D. M. Gay in 1986.